MongoDB BSON Restore, Converting to JSON, and More MongoDB Helpful Commands
Wednesday, May 25th, 2022
MongoDB Helpful Scripts & Commands
Restoring BSON Backups
The batch script below uses 7-Zip to extract every gzipped (.gz) BSON collection (table) backup found in the script's own folder and its subfolders, then restores those collections into the MongoDB database you name:
@ECHO ON
SET SourceDir=%~dp0
REM /d also switches drives; the quotes protect paths that contain spaces
cd /d "%SourceDir%"
mkdir "extracted"
mkdir "extracted\json"
FOR /R "%SourceDir%" %%A IN ("*.gz") DO "C:\Program Files\7-Zip\7z.exe" x "%%~A" -o"%SourceDir%\extracted"
mongorestore -d {DATABASE_NAME_TO_RESTORE_TABLES_INTO} --host localhost:27017 "extracted"
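For readers who are not on Windows or do not have 7-Zip installed, the same two steps (decompress every .gz file, then point mongorestore at the output folder) can be sketched in Python. This is only a rough equivalent, not the author's script: the helper names `extract_gz_backups` and `build_restore_command` are made up for illustration, and the sketch assumes `mongorestore` is available on the PATH.

```python
import gzip
import shutil
from pathlib import Path


def extract_gz_backups(source_dir, out_dir):
    """Decompress every *.gz file under source_dir into out_dir.

    Hypothetical helper; returns the list of extracted file paths.
    """
    source = Path(source_dir)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    # sorted() materialises the file list before anything is written
    for gz_path in sorted(source.rglob("*.gz")):
        target = out / gz_path.stem  # "users.bson.gz" -> "users.bson"
        with gzip.open(gz_path, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        extracted.append(target)
    return extracted


def build_restore_command(database, directory, host="localhost:27017"):
    """Build the same mongorestore call that the batch script makes."""
    return ["mongorestore", "-d", database, "--host", host, str(directory)]
```

Passing the list returned by `build_restore_command` to `subprocess.run(..., check=True)` would then perform the restore.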
Converting MongoDB Tables and Data to Proper JSON
To convert a MongoDB table and its data to JSON, use the command below:
mongoexport -d {DATABASE_TO_EXPORT_FROM} --host localhost:27017 -c {TABLE_NAME_TO_EXPORT_CONVERT_INTO_JSON} --jsonArray --pretty -o "%SourceDir%\extracted\json\{TABLE_NAME_BEING_CONVERTED_TO_JSON_NAME}.json"
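When several tables need exporting, the command above has to be repeated once per table. A small Python loop can do that; the sketch below is an assumption-laden illustration rather than part of the original workflow. The function names are hypothetical, `mongoexport` is assumed to be on the PATH, and only flags already shown above are used.

```python
import subprocess
from pathlib import Path


def build_export_command(database, collection, out_dir, host="localhost:27017"):
    """Build one mongoexport call with the same flags as the command above."""
    out_file = Path(out_dir) / f"{collection}.json"
    return [
        "mongoexport", "-d", database, "--host", host,
        "-c", collection, "--jsonArray", "--pretty",
        "-o", str(out_file),
    ]


def export_collections(database, collections, out_dir, run=subprocess.run):
    """Export each named collection to <out_dir>/<collection>.json.

    `run` is injectable so the loop can be tried without MongoDB installed.
    """
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    written = []
    for name in collections:
        command = build_export_command(database, name, out_dir)
        run(command, check=True)  # raises if mongoexport exits non-zero
        written.append(command[-1])
    return written
```

For example, `export_collections("mydb", ["users", "orders"], "extracted/json")` would run two exports and return the two output paths; the database and table names here are placeholders.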
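With older versions of MongoDB, the exported .json file wraps dates and 64-bit integers in $date and $numberLong objects instead of writing plain values, which most other databases and programming languages will not import cleanly. To flatten these properties into plain values, you can run the Python script below. The article targets Python 2.7, where input() evaluates what you type, so either wrap the path in quotes at the prompt or swap in raw_input(); under Python 3 the script runs as written: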
######################################
#               About                #
######################################
# Author:  Eric Arnol-Martin https://eamster.tk
# Purpose: Replaces Mongo's Exported DateTime Format with Proper DateTime String Representations for Easier Import for Other Databases / Programming Languages
# Expects: Mongo JSON Exported Pretty File.
#          For example, a file produced by a command similar to
#          "mongoexport -d {db_name} --host localhost:27017 -c {table_name} --jsonArray --pretty -o {table_name}.json"
# Tests:   RegEx Test Link: https://regexr.com/68l7r
# Outputs: Creates a copy of the input JSON file with DateTime objects replaced with their proper string representation in the same directory as the original file
#          with the same file name suffixed with "_NEW" at the end of it.
# Sources: https://stackoverflow.com/questions/2503413/regular-expression-to-stop-at-first-match
#          https://stackoverflow.com/questions/159118/how-do-i-match-any-character-across-multiple-lines-in-a-regular-expression

######################################
#              Imports               #
######################################
import re
from os.path import exists
import fileinput

######################################
#           Actual Program           #
######################################
path = input("Enter path or name of file to parse: ")
prevPiece = None
content_new = ""
boolReplaceFromNextLine = False
boolHandlingLong = False

if exists(path):
    # Clear new file
    b = open(path + "_NEW", "w+")
    b.close()

    count = 0
    recordCount = 0
    for line in fileinput.input(files=path):
        count = count + 1
        if '"$date":' in line:
            content_new = content_new[0:content_new.rindex('{')] + line.replace('"$date": {', '').replace('"$date":', '').replace('\n', '').strip()
            boolReplaceFromNextLine = True
        else:
            if boolReplaceFromNextLine:
                if '"$numberLong":' in line:
                    content_new = content_new + line.replace('"$numberLong":', '').replace('\n', '').strip()
                    boolReplaceFromNextLine = True
                    boolHandlingLong = True
                else:
                    if boolHandlingLong:
                        content_new = content_new + line.replace('}', '').strip()
                        boolReplaceFromNextLine = True
                        boolHandlingLong = False
                    else:
                        boolReplaceFromNextLine = False
                        content_new = content_new + line.replace('}', '').strip() + '\n'
            else:
                content_new = content_new + line

        if line == '},\n' and content_new:
            recordCount = recordCount + 1
            b = open(path + "_NEW", "a+")
            b.write(content_new)
            b.close()
            content_new = ""
            print('Record ' + str(recordCount) + ' processed... adding it to the file...')

        prevPiece = line

    if content_new:
        b = open(path + "_NEW", "a+")
        b.write(content_new)
        b.close()
        content_new = ""
        recordCount = recordCount + 1
        print('Record ' + str(recordCount) + ' processed... adding it to the file...')
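The script above works line by line, so it depends on the exact layout that --pretty produces. As a point of comparison, here is a sketch of a different technique: parse the export with Python's json module and unwrap the $date and $numberLong objects recursively. It is not the author's method, and the function names are hypothetical. Two differences are worth knowing: it loads the whole file into memory rather than streaming record by record, and it turns $numberLong values into integers where the script above keeps them as quoted strings.

```python
import json


def unwrap_extended_json(value):
    """Recursively replace {"$date": ...} and {"$numberLong": ...} wrappers."""
    if isinstance(value, dict):
        if len(value) == 1:
            if "$date" in value:
                # "$date" holds either an ISO string or a nested $numberLong
                return unwrap_extended_json(value["$date"])
            if "$numberLong" in value:
                return int(value["$numberLong"])
        return {key: unwrap_extended_json(item) for key, item in value.items()}
    if isinstance(value, list):
        return [unwrap_extended_json(item) for item in value]
    return value


def convert_file(path):
    """Write an unwrapped copy of the export next to it, suffixed with _NEW."""
    with open(path, encoding="utf-8") as src:
        records = json.load(src)
    new_path = path + "_NEW"
    with open(new_path, "w", encoding="utf-8") as dst:
        json.dump(unwrap_extended_json(records), dst, indent=4)
    return new_path
```

Other wrappers, such as the $oid object used for _id values, are left untouched by this sketch, which matches what the line-based script does.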