author: netop://ウィビ <paul@webb.page>, 2026-04-11 14:24:49 -0700
committer: netop://ウィビ <paul@webb.page>, 2026-04-11 14:24:49 -0700
commit: 8c34d810af95fae0ef846f54370a8c88bfab7123
tree: 436beaf30f7b2b3f15741dd54a37e313964d1f7d /memos/WM-045.txt
initial commit (HEAD, primary)
Diffstat (limited to 'memos/WM-045.txt')
-rw-r--r-- memos/WM-045.txt 247
1 files changed, 247 insertions, 0 deletions
diff --git a/memos/WM-045.txt b/memos/WM-045.txt
new file mode 100644
index 0000000..456da18
--- /dev/null
+++ b/memos/WM-045.txt
@@ -0,0 +1,247 @@
+
+
+
+
+
+
+
+Document: WM-045 P. Webb
+Category: Tutorial 2020.01.28
+
+ Migrating from MongoDB to RethinkDB
+
+Abstract
+
+ Thank me later
+
+Body
+
+ RethinkDB, seemingly on life support for quite some time, is seeing a
+ revival[1] of sorts. As such, I thought it prudent to make available
+ evergreen content for my favorite database these days. If you are
+ interested in trying RethinkDB you can check out these[2] two[3]
+ tutorials (my guide will not cover installation or setup).
+
+ 1. Preparing MongoDB exports
+
+ ```sh
+ # Command
+ mongoexport --port PORT_NUMBER --db DATABASE_NAME --collection COLLECTION_NAME --out COLLECTION_NAME-`date "+%Y-%m-%d"`.json --pretty --jsonArray
+
+ # Example
+ mongoexport --port 27017 --db dawebb --collection users --out users-`date "+%Y-%m-%d"`.json --pretty --jsonArray
+ ```
+
+ There's a bit to unpack here so I'll break it down. Keep in mind
+ that all the parameters yelling at you are *placeholders* (for you
+ to replace with your own parameters).
+
+ Actually, the placeholders are self-explanatory but the second
+ half of the command is interesting.
+
+ ```COLLECTION_NAME-`date "+%Y-%m-%d"`.json``` makes it so the
+ exported collection looks like `users-2020-01-24.json`, with the
+ date being whenever you ran the above command. Super nifty for
+ backups too.
+
+ The `--pretty` flag isn't necessary for the import into RethinkDB
+ to work; it's there so *you* can inspect the export if you need to.
+
+ The last flag[4], `--jsonArray`, is the most important. For some
+ reason, MongoDB exports each item in a collection as its own
+ object *not* separated by commas. Maybe MongoDB's import process
+ doesn't choke on malformed JSON but everything else does.
+ `--jsonArray` puts the contents of the export into a single JSON
+ array. Like you'd expect by default…maybe that's just me.
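+
+ To make the difference concrete, here's a minimal sketch in plain
+ JavaScript (Node.js assumed). The sample strings are made up to
+ mimic the two export shapes; they are not real `mongoexport`
+ output, and `tryParse`/`parseLineDelimited` are hypothetical
+ helper names.
+
+ ```js
+ // Hypothetical samples mimicking the two export shapes.
+ const withoutJsonArray = '{"email":"a@domain.tld"}\n{"email":"b@domain.tld"}\n';
+ const withJsonArray = '[{"email":"a@domain.tld"},{"email":"b@domain.tld"}]';
+
+ // Returns the parsed value, or null when the text isn't one valid JSON document.
+ function tryParse(text) {
+   try {
+     return JSON.parse(text);
+   } catch (error) {
+     return null;
+   }
+ }
+
+ // Fallback for exports made without `--jsonArray`: parse one object per line.
+ // Assumes the export was NOT run with `--pretty` (one object per line).
+ function parseLineDelimited(text) {
+   return text
+     .split("\n")
+     .filter(line => line.trim() !== "")
+     .map(line => JSON.parse(line));
+ }
+
+ console.log(tryParse(withoutJsonArray)); // null
+ console.log(tryParse(withJsonArray).length); // 2
+ console.log(parseLineDelimited(withoutJsonArray).length); // 2
+ ```
+
+ The fallback only matters if you're stuck with an export someone
+ else made; with `--jsonArray` a plain `JSON.parse` is enough.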
+
+ NOTE: `--out` is the destination path so if you haven't prefaced
+ ```COLLECTION_NAME-`date "+%Y-%m-%d"`.json``` with a path, the
+ export will land in the directory you ran the command from (your
+ home directory, if you just logged in).
+
+ Anyhoo once you've exported the collections you care about, SFTP
+ into that server to grab them and place them on your Desktop so
+ you don't have a brain fart and forget where you put them
+ moments later.
+
+ 2. Migrating, phase 01
+
+ MongoDB comes with some oddities that you may not want in your new
+ database. Notably, how it deals with IDs. Here's an example:
+
+ ```json
+ {
+ "_id": {
+ "$oid": "6bf9b676c24869077c37f61e"
+ },
+ "admin": true,
+ "dashboard": [],
+ "language": "en_US",
+ "loginMethod": "link",
+ "nameFirst": "",
+ "nameLast": "",
+ "plan": "free",
+ "summaries": [],
+ "timezone": "gmt-05-02",
+ "verified": true,
+ "email": "user@domain.tld",
+ "__v": 0
+ }
+ ```
+
+ In RethinkDB IDs are simply `id` and you have no need for `__v` so
+ you probably don't want these values in your shiny new database.
+ Also, you may have decided to use this migration period to switch
+ up your schema. Combine `nameFirst` with `nameLast`? Drop `plan`?
+ Update `timezone`? Replace `createdAt` with `created`? Regardless,
+ you're gonna need to do a bit of legwork to clean your
+ MongoDB export(s).
+
+ The entire script I use is hosted here[5] but I'll point out some
+ relevant pieces.
+
+ If you have any fields with dates/milliseconds, your import will
+ fail unless you wrap those fields in `new Date` like so:
+
+ ```js
+ …,
+ timestamp: new Date(timestamp),
+ …,
+ ```
+
+ To reuse the IDs that were generated in MongoDB for usage in
+ RethinkDB, you're gonna need to do something like this:
+
+ ```js
+ …,
+ id: record._id["$oid"],
+ …,
+ ```
+
+ You'll also need to make sure to explicitly select the fields you
+ want to transfer into your new export. The gist linked above
+ should answer remaining questions you may have.
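+
+ Put together, the pieces above might look something like this. It
+ is a minimal sketch, not the script from the gist: `migrateUser`
+ is a hypothetical name, the kept fields are just one possible
+ choice, and it assumes `created` was stored as epoch milliseconds.
+
+ ```js
+ // Hypothetical cleanup for a single exported `users` record.
+ // Assumes `created` holds epoch milliseconds; adjust to your data.
+ function migrateUser(record) {
+   return {
+     id: record._id["$oid"], // reuse the ID MongoDB generated
+     admin: Boolean(record.admin),
+     created: new Date(record.created), // JSON.stringify turns this into an ISO 8601 string
+     email: record.email,
+     language: record.language,
+     name: [record.nameFirst, record.nameLast].filter(Boolean).join(" "),
+     verified: Boolean(record.verified)
+     // `__v`, `plan`, and anything else not listed here is dropped
+   };
+ }
+
+ // Made-up record shaped like the export example earlier in this memo.
+ const exported = [
+   {
+     _id: { $oid: "6bf9b676c24869077c37f61e" },
+     admin: true,
+     created: 1579824000000,
+     email: "user@domain.tld",
+     language: "en_US",
+     nameFirst: "",
+     nameLast: "",
+     plan: "free",
+     verified: true,
+     __v: 0
+   }
+ ];
+
+ const migrated = exported.map(migrateUser);
+ console.log(JSON.stringify(migrated, null, 2));
+ ```
+
+ In a real run you'd read the export from disk and write the
+ stringified result to a new file, which becomes the file you
+ import in the next step.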
+
+ 3. Importing into RethinkDB
+
+ Even though you've already installed RethinkDB, you need to
+ install the Python driver[6] as well (`rethinkdb import` relies on
+ it; at least, I had to do this on macOS).
+
+ Also, make sure you are importing your newly processed/migrated
+ data into RethinkDB, not the original nonsense from your MongoDB
+ export (unless of course, that's your plan).
+
+ ```sh
+ # Command
+ rethinkdb import -f PATH_TO_PROCESSED_EXPORT_FILE --table DATABASE.TABLE -c CONNECTION_URL --password-file PASSWORD_FILE --force
+
+ # Example
+ rethinkdb import -f ~/Desktop/migrated/users-2020-01-24.json --table dawebb.users -c localhost:28015 --password-file ~/Desktop/rethinkpass.txt --force
+ ```
+
+ If you don't have a password on your RethinkDB database, you can
+ safely omit the `--password-file` flag. Otherwise, make sure the
+ password file only contains the password. If your IDE
+ automatically generates new lines in files, just create the
+ password file with `nano`.
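+
+ If you'd rather not depend on any editor's newline behavior, a
+ few lines of Node.js can write the file for you. This is a sketch
+ under assumptions: `writePasswordFile` is a hypothetical helper,
+ and the path and password below are placeholders.
+
+ ```js
+ const fs = require("node:fs");
+ const os = require("node:os");
+ const path = require("node:path");
+
+ // Writes `password` to `filePath` with no trailing newline.
+ // The 0600 mode (owner read/write only) applies when the file is newly created.
+ function writePasswordFile(filePath, password) {
+   fs.writeFileSync(filePath, password, { encoding: "utf8", mode: 0o600 });
+ }
+
+ // Placeholder path and password, for illustration only.
+ const passwordFile = path.join(os.tmpdir(), "rethinkpass.txt");
+ writePasswordFile(passwordFile, "correct horse battery staple");
+
+ console.log(fs.readFileSync(passwordFile, "utf8").endsWith("\n")); // false
+ ```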
+
+ Make sure you run the above command while RethinkDB is running and
+ you'll see your freshly created tables.
+
+ 4. Migrating, phase 02
+
+ Alright, we're almost at the finish line!
+
+ One of the neat things about RethinkDB (and a feature that
+ convinced me to make the jump) is its Data Explorer. It's a UI
+ that allows you to manipulate or check out your tables. There are
+ just two remaining things we need to do and they're quick and
+ easy: 1) set up indexes for our tables and 2) update time-based
+ data to a format RethinkDB likes.
+
+ Visit `http://localhost:8080` (default port, unless you changed
+ it) and click on "Data Explorer" in the header. In the text field
+ you'll be able to perform queries using JavaScript.
+
+ 4.1. Setting up indexes
+
+ By default `id` is an index but you may want more. Secondary
+ indexes speed up lookups on the fields you query most; RethinkDB
+ doesn't require those fields to hold unique values, but fields
+ like an email or a slug are easy candidates.
+
+ Sometimes, only the ID is worth indexing and that's fine.
+
+ ```js
+ // Command (the Data Explorer speaks JavaScript, so it's `indexCreate`)
+ r.db("DATABASE_NAME").table("TABLE_NAME").indexCreate("FIELD_NAME");
+
+ // Examples
+ r.db("dawebb").table("users").indexCreate("email");
+ r.db("dawebb").table("posts").indexCreate("slug");
+ ```
+
+ 4.2. Updating time-based fields
+
+ Now let's update our time-based fields:
+
+ ```js
+ // Command
+ r.db("DATABASE_NAME").table("TABLE_NAME").update({
+ created: r.ISO8601(r.row("created")),
+ updated: r.ISO8601(r.row("updated"))
+ });
+
+ // Examples
+ r.db("dawebb").table("users").update({
+ created: r.ISO8601(r.row("created")),
+ updated: r.ISO8601(r.row("updated"))
+ });
+
+ r.db("dawebb").table("visits").update({
+ timestamp: r.ISO8601(r.row("timestamp"))
+ });
+ ```
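+
+ Before running the updates above, it can help to confirm that the
+ processed export only holds strings `r.ISO8601` will accept (it
+ wants an ISO 8601 date-time, and one without a time zone needs a
+ default time zone passed in). Here's a rough pre-flight check in
+ plain JavaScript; `findBadTimestamps` and the sample records are
+ hypothetical, and the pattern is deliberately narrow rather than a
+ full ISO 8601 validator.
+
+ ```js
+ // Narrow check: full date-time with an explicit time zone, which
+ // covers what `Date.prototype.toISOString` emits. Not a full validator.
+ const ISO_WITH_ZONE = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;
+
+ // Returns the `id`s of records where any listed field fails the check.
+ function findBadTimestamps(records, fields) {
+   return records
+     .filter(record => fields.some(field => !ISO_WITH_ZONE.test(String(record[field]))))
+     .map(record => record.id);
+ }
+
+ // Made-up records, shaped like a processed export.
+ const records = [
+   { id: "a", created: "2020-01-24T00:00:00.000Z", updated: "2020-01-25T12:30:00.000Z" },
+   { id: "b", created: "2020-01-24 00:00:00", updated: "2020-01-25T12:30:00.000Z" },
+   { id: "c", created: 1579824000000, updated: "2020-01-25T12:30:00.000Z" }
+ ];
+
+ console.log(findBadTimestamps(records, ["created", "updated"])); // [ 'b', 'c' ]
+ ```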
+
+ FIN
+
+ And there you have it! A super easy guide to move from MongoDB to
+ RethinkDB. I've been using RethinkDB for several months now and I
+ am way happier than I was with MongoDB. MongoDB is super easy to
+ get into, but once you get in too deep it becomes an exercise in
+ frustration to find solutions to ambiguous errors, and the MongoDB
+ docs are not user-friendly.
+
+ Contrast that with RethinkDB's Data Explorer, clear error
+ messages, and clean documentation and it's not difficult to
+ imagine why I'd make the switch. 🕸
+
+ P.S. New year, new projects[7], and now I feel like I need a new
+ design for this blog. And then I remembered that first I need to
+ create a personal API[8] so this blog can just become the
+ presentation layer for the content.
+
+ **2020.01.30 update**
+
+ > Another reason to migrate is the license of MongoDB: SSPL vs.
+ > Apache 2 of RethinkDB.
+ > — af[9]
+
+ For others who may not know what SSPL[10] entails (like me until I
+ read the linked post):
+
+ Basically, SSPL means one cannot offer MongoDB as a hosted
+ service without releasing the source of the entire service stack
+ under the same license. That makes sense from their end as they
+ offer hosting. However, it's a bit of a punk move because they
+ are preventing potential competition from forcing them to improve
+ their product. 🕸
+
+References
+
+ [1] <https://rethinkdb.com/blog/2.4.0-release>
+ [2] <https://pusher.com/tutorials/live-node-rethinkdb>
+ [3] <https://www.pluralsight.com/guides/a-practical-introduction-to-rethinkdb>
+ [4] <https://docs.mongodb.com/manual/reference/program/mongoexport/#cmdoption-mongoexport-jsonarray>
+ [5] <https://gist.github.com/NetOperatorWibby/5084bf5c64306093e067fc43cfa4fcdb>
+ [6] <https://rethinkdb.com/docs/install-drivers/python>
+ [7] <https://socii.network/NetOpWibby/status/e3HWCaoqTZYzZvZ47RXfp>
+ [8] </WM-042>
+ [9] <https://lobste.rs/u/af>
+ [10] <https://lukasatkinson.de/2019/mongodb-no-longer-seeks-osi-approval-for-sspl>