Document: WM-045                                                 P. Webb
Category: Tutorial                                            2020.01.28

                  Migrating from MongoDB to RethinkDB

Abstract

   Thank me later

Body

   RethinkDB, seemingly on life support for quite some time, is seeing a
   revival[1] of sorts. As such, I thought it prudent to make available
   evergreen content for my favorite database these days. If you are
   interested in trying RethinkDB you can check out these[2] two[3]
   tutorials (my guide will not cover installation or setup).

   1. Preparing MongoDB exports

      ```sh
      # Command
      mongoexport --port PORT_NUMBER --db DATABASE_NAME --collection COLLECTION_NAME --out COLLECTION_NAME-`date "+%Y-%m-%d"`.json --pretty --jsonArray

      # Example
      mongoexport --port 27017 --db dawebb --collection users --out users-`date "+%Y-%m-%d"`.json --pretty --jsonArray
      ```

      There's a bit to unpack here so I'll break it down. Keep in mind
      that all the parameters yelling at you are *placeholders* (for you
      to replace with your own parameters).

      Actually, the placeholders are self-explanatory but the second
      half of the command is interesting.

      ```COLLECTION_NAME-`date "+%Y-%m-%d"`.json``` makes it so the
      exported collection looks like `users-2020-01-24.json`, with the
      date being whenever you ran the above command. Super nifty for
      backups too.
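
      For illustration, here is the same naming scheme as a small
      JavaScript sketch. `exportFileName` is a hypothetical helper, not
      part of `mongoexport`; like `date`, it uses local time.

      ```js
      // Hypothetical helper: builds "COLLECTION-YYYY-MM-DD.json",
      // mirroring what `date "+%Y-%m-%d"` produces in the shell.
      function exportFileName(collection, date = new Date()) {
        const year = date.getFullYear();
        // getMonth() is zero-based, so add 1 and pad to two digits.
        const month = String(date.getMonth() + 1).padStart(2, "0");
        const day = String(date.getDate()).padStart(2, "0");

        return `${collection}-${year}-${month}-${day}.json`;
      }

      // Prints "users-2020-01-24.json"
      console.log(exportFileName("users", new Date(2020, 0, 24)));
      ```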

      The `--pretty` flag isn't necessary for the import into RethinkDB
      to work; it's there so *you* can inspect the export for any reason.

      The last flag[4], `--jsonArray`, is the most important. By
      default, `mongoexport` writes each item in a collection as its own
      object, *not* separated by commas and not wrapped in an array, so
      the file as a whole is not one valid JSON document. Maybe
      MongoDB's import process doesn't choke on that but most JSON
      parsers do. `--jsonArray` puts the contents of the export into a
      single JSON array. Like you'd expect by default…maybe that's just
      me.
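
      To see the difference, here is a rough sketch with made-up sample
      data (two tiny documents, exported without `--pretty`).
      `parseWhole` and `parseLines` are hypothetical helpers, only here
      to show which form a plain `JSON.parse` accepts.

      ```js
      // Default export shape: one JSON document per line (sample data).
      const lineDelimited =
        '{"_id":{"$oid":"a1"},"email":"a@domain.tld"}\n' +
        '{"_id":{"$oid":"b2"},"email":"b@domain.tld"}\n';

      // Shape with --jsonArray: a single JSON array (sample data).
      const jsonArray =
        '[{"_id":{"$oid":"a1"},"email":"a@domain.tld"},' +
        '{"_id":{"$oid":"b2"},"email":"b@domain.tld"}]';

      // Parse the whole text at once; return null if it isn't valid JSON.
      function parseWhole(text) {
        try {
          return JSON.parse(text);
        } catch (error) {
          return null;
        }
      }

      // Fallback for the default shape: parse each non-empty line.
      function parseLines(text) {
        return text
          .split("\n")
          .filter(line => line.trim() !== "")
          .map(line => JSON.parse(line));
      }

      console.log(parseWhole(lineDelimited)); // null, not one JSON value
      console.log(parseWhole(jsonArray).length); // 2
      console.log(parseLines(lineDelimited).length); // 2
      ```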

      NOTE: `--out` is the destination path so if you haven't prefaced
      ```COLLECTION_NAME-`date "+%Y-%m-%d"`.json``` with a path, the
      export will land in whatever directory you ran the command from
      (your home directory, if you just logged in).

      Anyhoo once you've exported the collections you care about, SFTP
      into that server to grab them and place them on your Desktop so
      you don't have a brain fart and forget where you put them
      moments later.

   2. Migrating, phase 01

      MongoDB comes with some oddities that you may not want in your new
      database. Notably, how it deals with IDs. Here's an example:

      ```json
      {
        "_id": {
          "$oid": "6bf9b676c24869077c37f61e"
        },
        "admin": true,
        "dashboard": [],
        "language": "en_US",
        "loginMethod": "link",
        "nameFirst": "",
        "nameLast": "",
        "plan": "free",
        "summaries": [],
        "timezone": "gmt-05-02",
        "verified": true,
        "email": "user@domain.tld",
        "__v": 0
      }
      ```

      In RethinkDB IDs are simply `id` and you have no need for `__v` so
      you probably don't want these values in your shiny new database.
      Also, you may have decided to use this migration period to switch
      up your schema. Combine `nameFirst` with `nameLast`? Drop `plan`?
      Update `timezone`? Replace `createdAt` with `created`? Regardless,
      you're gonna need to do a bit of legwork to clean your
      MongoDB export(s).

      The entire script I use is hosted here[5] but I'll point out some
      relevant pieces.

      If you have any fields with dates/milliseconds, your import will
      fail unless you wrap those fields in `new Date` like so:

      ```js
      …,
      timestamp: new Date(timestamp),
      …,
      ```

      To reuse the IDs that were generated in MongoDB for usage in
      RethinkDB, you're gonna need to do something like this:

      ```js
      …,
      id: record._id["$oid"],
      …,
      ```

      You'll also need to make sure to explicitly select the fields you
      want to transfer into your new export. The gist linked above
      should answer remaining questions you may have.
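
      To tie those pieces together, here is a minimal sketch of a
      per-record cleanup. It is *not* the script from the gist:
      `migrateUser` is a hypothetical function, the selected fields are
      just one possible choice, and it assumes `createdAt` holds epoch
      milliseconds.

      ```js
      // Hypothetical cleanup for one exported user record.
      // Only the fields listed here survive; `_id`, `__v`, and `plan`
      // are dropped simply by not being copied over.
      function migrateUser(record) {
        return {
          // Reuse the MongoDB-generated ID as the RethinkDB `id`.
          id: record._id["$oid"],
          admin: record.admin,
          email: record.email,
          language: record.language,
          // Combine the two name fields, skipping empty parts.
          name: [record.nameFirst, record.nameLast].filter(Boolean).join(" "),
          timezone: record.timezone,
          verified: record.verified,
          // Assumption: `createdAt` is epoch milliseconds.
          // JSON.stringify later writes this Date as an ISO 8601 string.
          created: new Date(record.createdAt)
        };
      }

      const sampleUser = {
        _id: { $oid: "6bf9b676c24869077c37f61e" },
        admin: true,
        language: "en_US",
        nameFirst: "",
        nameLast: "",
        plan: "free",
        timezone: "gmt-05-02",
        verified: true,
        email: "user@domain.tld",
        createdAt: 1579824000000,
        __v: 0
      };

      console.log(JSON.stringify(migrateUser(sampleUser), null, 2));
      ```

      Running every record of an export through a function like this
      (`records.map(migrateUser)`) and writing the result with
      `JSON.stringify` gives you the processed file for the next step.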

   3. Importing into RethinkDB

      Even though you've already installed RethinkDB, you need to
      install the Python driver[6] as well; `rethinkdb import` depends
      on it (at least, I had to do this on macOS).

      Also, make sure you are importing your newly processed/migrated
      data into RethinkDB, not the original nonsense from your MongoDB
      export (unless of course, that's your plan).

      ```sh
      # Command
      rethinkdb import -f PATH_TO_PROCESSED_EXPORT_FILE --table DATABASE.TABLE -c CONNECTION_URL --password-file PASSWORD_FILE --force

      # Example
      rethinkdb import -f ~/Desktop/migrated/users-2020-01-24.json --table dawebb.users -c localhost:28015 --password-file ~/Desktop/rethinkpass.txt --force
      ```

      If you don't have a password on your RethinkDB database, you can
      safely omit the `--password-file` flag. Otherwise, make sure the
      password file only contains the password. If your IDE
      automatically generates new lines in files, just create the
      password file with `nano`.

      Make sure you run the above command while RethinkDB is running and
      you'll see your freshly created tables show up.

   4. Migrating, phase 02

      Alright, we're almost at the finish line!

      One of the neat things about RethinkDB (and a feature that
      convinced me to make the jump) is its Data Explorer. It's a UI
      that allows you to manipulate or check out your tables. There are
      just two remaining things we need to do and they're quick and
      easy: 1) set up indexes for our tables and 2) update time-based
      data to a format RethinkDB likes.

      Visit `http://localhost:8080` (default port, unless you changed
      it) and click on "Data Explorer" in the header. In the text field
      you'll be able to perform queries using JavaScript.

      4.1. Setting up indexes

      By default `id` is an index but you may want more. Secondary
      indexes speed up lookups on fields you query often. The values
      don't have to be unique, but fields with unique values are the
      easy candidates to think of.

      Sometimes, only the ID would be unique and that's fine.

      ```js
      // Command
      r.db("DATABASE_NAME").table("TABLE_NAME").indexCreate("FIELD_WITH_UNIQUE_VALUE");

      // Examples
      r.db("dawebb").table("users").indexCreate("email");
      r.db("dawebb").table("posts").indexCreate("slug");
      ```

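      4.2. Updating time-based fields

      Now let's update our time-based fields. `r.ISO8601` turns an ISO
      8601 string into a native RethinkDB time value: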

      ```js
      // Command
      r.db("DATABASE_NAME").table("TABLE_NAME").update({
        created: r.ISO8601(r.row("created")),
        updated: r.ISO8601(r.row("updated"))
      });

      // Examples
      r.db("dawebb").table("users").update({
        created: r.ISO8601(r.row("created")),
        updated: r.ISO8601(r.row("updated"))
      });

      r.db("dawebb").table("visits").update({
        timestamp: r.ISO8601(r.row("timestamp"))
      });
      ```
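
      `r.ISO8601` expects an ISO 8601 string, so it can help to check
      the processed export before importing it. A minimal sketch,
      assuming your dates were written by `JSON.stringify` from `Date`
      objects (as in phase 01); `isIsoTimestamp` and `findBadTimestamps`
      are hypothetical helpers, not part of RethinkDB.

      ```js
      // True only for strings in the exact form Date#toISOString emits,
      // e.g. "2020-01-24T00:00:00.000Z". This is stricter than what
      // r.ISO8601 accepts, which is fine for a pre-import sanity check.
      function isIsoTimestamp(value) {
        if (typeof value !== "string") return false;

        const parsed = new Date(value);
        // Invalid dates have a NaN time value; toISOString would throw.
        return !Number.isNaN(parsed.getTime()) && parsed.toISOString() === value;
      }

      // Return the records where any of the given fields fails the check.
      function findBadTimestamps(records, fields) {
        return records.filter(record =>
          fields.some(field => !isIsoTimestamp(record[field]))
        );
      }

      // Sample data: the second record still holds raw milliseconds.
      const visits = [
        { id: "a1", timestamp: "2020-01-24T00:00:00.000Z" },
        { id: "b2", timestamp: 1579824000000 }
      ];

      console.log(findBadTimestamps(visits, ["timestamp"]).map(v => v.id)); // [ 'b2' ]
      ```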

   FIN

      And there you have it! A super easy guide to move from MongoDB to
      RethinkDB. I've been using RethinkDB for several months now and I
      am way happier than I was with MongoDB. MongoDB is super easy to
      get into, but once you get in too deep it becomes an exercise in
      frustration to find solutions to ambiguous errors, and the MongoDB
      docs are not user-friendly.

      Contrast that with RethinkDB's Data Explorer, clear error
      messages, and clean documentation and it's not difficult to
      imagine why I'd make the switch. 🕸

      P.S. New year, new projects[7], and now I feel like I need a new
      design for this blog. And then I remembered that first I need to
      create a personal API[8] so this blog can just become the
      presentation layer for the content.

      **2020.01.30 update**

      > Another reason to migrate is the license of MongoDB: SSPL vs.
      > Apache 2 of RethinkDB.
      > — af[9]

      For others who may not know what SSPL[10] entails (like me until I
      read the linked post):

      Basically, SSPL means one cannot offer MongoDB as a hosted
      service without also releasing the source code of the whole
      service under the SSPL. That makes sense from their end as they
      offer hosting. However, it's a bit of a punk move because they are
      preventing potential competition from forcing them to improve
      their product. 🕸

References

   [1]  <https://rethinkdb.com/blog/2.4.0-release>
   [2]  <https://pusher.com/tutorials/live-node-rethinkdb>
   [3]  <https://www.pluralsight.com/guides/a-practical-introduction-to-rethinkdb>
   [4]  <https://docs.mongodb.com/manual/reference/program/mongoexport/#cmdoption-mongoexport-jsonarray>
   [5]  <https://gist.github.com/NetOperatorWibby/5084bf5c64306093e067fc43cfa4fcdb>
   [6]  <https://rethinkdb.com/docs/install-drivers/python>
   [7]  <https://socii.network/NetOpWibby/status/e3HWCaoqTZYzZvZ47RXfp>
   [8]  </WM-042>
   [9]  <https://lobste.rs/u/af>
   [10] <https://lukasatkinson.de/2019/mongodb-no-longer-seeks-osi-approval-for-sspl>