MongoDB BSON Restore, Converting to JSON, and More MongoDB Helpful Commands

Wednesday, May 25th, 2022

MongoDB Helpful Scripts & Commands

Restoring BSON Backups

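The batch script below extracts every gzip-compressed (.gz) BSON backup file found under the folder the script lives in, then restores the extracted collections (MongoDB's equivalent of tables) into the Mongo database you name. It expects 7-Zip to be installed under C:\Program Files\7-Zip: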

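@ECHO ON
REM Work from the folder that contains this script (%~dp0 already ends with a backslash)
SET "SourceDir=%~dp0"
cd /d "%SourceDir%"
IF NOT EXIST "extracted\json" mkdir "extracted\json"
REM Extract every .gz archive found under the script folder into "extracted"
FOR /R %%A IN (*.gz) DO "C:\Program Files\7-Zip\7z.exe" x "%%~A" -o"%SourceDir%extracted"
REM Restore the extracted BSON files into the target database
mongorestore -d {DATABASE_NAME_TO_RESTORE_TABLES_INTO} --host localhost:27017 "extracted"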
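
If 7-Zip is not available, the extraction step can be approximated with Python's standard library. The following is only a sketch: the function name extract_gz_files and the folder names are made up for illustration, and it assumes each .gz file holds a single compressed file (for example users.bson.gz), not a multi-file archive.

```python
import gzip
import shutil
from pathlib import Path


def extract_gz_files(source_dir, target_dir):
    """Decompress every *.gz under source_dir into target_dir; return the written paths."""
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    # sorted() materializes the file listing before anything is written
    for archive in sorted(source.rglob("*.gz")):
        # "users.bson.gz" -> "users.bson": only the trailing .gz suffix is dropped
        destination = target / archive.stem
        with gzip.open(archive, "rb") as compressed, open(destination, "wb") as plain:
            shutil.copyfileobj(compressed, plain)
        written.append(destination)
    return written
```

Calling extract_gz_files(".", "extracted") from the backup folder would leave the decompressed files in "extracted", ready for the same mongorestore command shown above.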

Converting MongoDB Tables and Data to Proper JSON

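To export a MongoDB collection and its documents as JSON, run the command below once per collection. %SourceDir% is the variable set in the batch script above, so either run the command from that script or replace it with a real path: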

mongoexport -d {DATABASE_TO_EXPORT_FROM} --host localhost:27017 -c {TABLE_NAME_TO_EXPORT_CONVERT_INTO_JSON} --jsonArray --pretty -o "%SourceDir%\extracted\json\{TABLE_NAME_BEING_CONVERTED_TO_JSON_NAME}.json"
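
When several collections need exporting, the command can be assembled in a loop. The sketch below uses only the mongoexport flags already shown above; the helper names build_export_command and export_collections are hypothetical, and mongoexport is assumed to be on the PATH.

```python
import os
import subprocess


def build_export_command(database, collection, out_dir, host="localhost:27017"):
    """Return the mongoexport argument list for one collection."""
    return [
        "mongoexport",
        "-d", database,
        "--host", host,
        "-c", collection,
        "--jsonArray",
        "--pretty",
        "-o", os.path.join(out_dir, collection + ".json"),
    ]


def export_collections(database, collections, out_dir):
    """Run mongoexport once per collection; raises CalledProcessError on failure."""
    for name in collections:
        subprocess.run(build_export_command(database, name, out_dir), check=True)
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.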

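With older versions of MongoDB, the exported file is not plain JSON that other tools can consume directly: dates are wrapped in {"$date": ...} objects, sometimes with a nested {"$numberLong": ...}. These wrappers are MongoDB's Extended JSON notation, which most other databases and programming languages will not interpret as dates. To flatten the $date wrappers (including dates stored as a nested $numberLong) into plain values, you can run the Python script below, originally written for Python 2.7: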

######################################
# About                              #
######################################

# Author:   Eric Arnol-Martin https://eamster.tk
# Purpose:  Replaces Mongo's Exported DateTime Format with Proper DateTime String Representations for Easier Import for Other Databases / Programming Languages
# Expects:  Mongo JSON Exported Pretty File.  
#			For example, a file produced by a command similar to 
#			"mongoexport -d {db_name} --host localhost:27017 -c {table_name} --jsonArray --pretty -o {table_name}.json"
# Tests:    RegEx Test Link:  https://regexr.com/68l7r
# Outputs:  Creates a copy of the input JSON file with DateTime objects replaced with their proper string representation in the same directory as the original file 
#			with the same file name suffixed with "_NEW" at the end of it.
# Sources:  https://stackoverflow.com/questions/2503413/regular-expression-to-stop-at-first-match
#			https://stackoverflow.com/questions/159118/how-do-i-match-any-character-across-multiple-lines-in-a-regular-expression

######################################
# Imports                            #
######################################

import re
from os.path import exists
import fileinput


######################################
# Actual Program                     #
######################################

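# Read the path as plain text; Python 2's input() would evaluate it as an expression
try:
	read_line = raw_input  # Python 2
except NameError:
	read_line = input  # Python 3
path = read_line("Enter path or name of file to parse: ")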
prevPiece = None
content_new = ""
boolReplaceFromNextLine = False
boolHandlingLong = False

if exists(path):
	# Clear new file
	b = open(path + "_NEW", "w+")
	b.close()
	count = 0
	recordCount = 0

	for line in fileinput.input(files=path):
		count = count + 1
		if '"$date":' in line:
			# Drop the wrapper's opening brace and keep only the date value
			content_new = content_new[0:content_new.rindex('{')] + line.replace('"$date": {', '').replace('"$date":', '').replace('\n', '').strip()
			boolReplaceFromNextLine = True
		else:
			if boolReplaceFromNextLine:
				if '"$numberLong":' in line:
					# Date stored as {"$date": {"$numberLong": "..."}}: keep only the number
					content_new = content_new + line.replace('"$numberLong":', '').replace('\n', '').strip()
					boolReplaceFromNextLine = True
					boolHandlingLong = True
				else:
					if boolHandlingLong:
						# Closing brace of the inner $numberLong wrapper
						content_new = content_new + line.replace('}', '').strip()
						boolReplaceFromNextLine = True
						boolHandlingLong = False
					else:
						# Closing brace of the $date wrapper: finish the line
						boolReplaceFromNextLine = False
						content_new = content_new + line.replace('}', '').strip() + '\n'
			else:
				content_new = content_new + line
		
		if line == '},\n' and content_new:
			recordCount = recordCount + 1
			b = open(path + "_NEW", "a+")
			b.write(content_new)
			b.close()	
			content_new = ""
			print('Record ' + str(recordCount) + ' processed... adding it to the file...')
			
		prevPiece = line
	
	if content_new:
		b = open(path + "_NEW", "a+")
		b.write(content_new)
		b.close()	
		content_new = ""
		recordCount = recordCount + 1
		print('Record ' + str(recordCount) + ' processed... adding it to the file...')
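
As an alternative to the line-by-line rewriting above, the same unwrapping can be sketched with the json module's object_hook, which is called for every decoded object, innermost first. This is not the script's method, only an illustration: the names unwrap_extended_json and convert_export are hypothetical, the whole file is assumed to fit in memory, and unlike the script it turns every $numberLong value into an integer.

```python
import json


def unwrap_extended_json(obj):
    """object_hook: replace {"$date": x} and {"$numberLong": "n"} wrappers with plain values."""
    if len(obj) == 1:
        if "$numberLong" in obj:
            return int(obj["$numberLong"])
        if "$date" in obj:
            # Either an ISO-8601 string or epoch milliseconds; a nested
            # $numberLong has already been unwrapped by the time we get here
            return obj["$date"]
    return obj


def convert_export(text):
    """Parse mongoexport output and flatten its date and long wrappers."""
    return json.loads(text, object_hook=unwrap_extended_json)
```

The result of convert_export can be written back out with json.dump to produce a file similar to the "_NEW" copy the script creates.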