Dec-10-2022, 03:53 AM
Greetings...
I'm working on validating a large JSON file with millions of records, in batches, using jsonschema's Draft202012Validator.
import json
from azure.storage.blob import BlobServiceClient
from azure.identity import DefaultAzureCredential
account = "myAccount"
container = "myContainer"
blob_name = "myBlob.json"
default_credential = DefaultAzureCredential()
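# BlobServiceClient expects the full account URL, not just the account name
blob_service_client = BlobServiceClient(f"https://{account}.blob.core.windows.net", credential=default_credential)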
container_client = blob_service_client.get_container_client(container)
blob_client = container_client.get_blob_client(blob_name)
data = bytearray(blob_client.download_blob().readall())
batch_size = 1000
process = json.loads(data)
for i in range(0, len(process), batch_size):
    batch = process[i:i + batch_size]
    # process data
    pass

I'm running into an error about converting an object of type bytearray to a JSON-serializable string.
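
For what it's worth, here is a minimal sketch of where that kind of error usually comes from. It assumes the error is raised by a json.dumps call that isn't shown in the snippet, since json.loads itself accepts str, bytes, and bytearray. The helper name iter_batches and the sample payload in raw are hypothetical stand-ins, not part of the original code.

```python
import json


def iter_batches(records, batch_size):
    """Yield consecutive slices of at most batch_size records (hypothetical helper)."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# Hypothetical stand-in for bytearray(blob_client.download_blob().readall()).
raw = bytearray(b'[{"id": 1}, {"id": 2}, {"id": 3}]')

# Parsing works directly: json.loads accepts str, bytes, and bytearray.
records = json.loads(raw)

# Serializing is different: json.dumps cannot encode a bytearray value,
# so this call raises TypeError.
try:
    json.dumps({"payload": raw})
    dumps_error = None
except TypeError as exc:
    dumps_error = str(exc)

# Decoding the bytes to str first makes the value serializable.
serialized = json.dumps({"payload": raw.decode("utf-8")})

# Slice the parsed list lazily instead of building every batch up front.
batches = list(iter_batches(records, batch_size=2))
```

If that assumption holds, the thing to look for is any place where the raw bytearray (rather than the parsed list) is passed to json.dumps or to something that serializes it internally. Using a generator for the batches also avoids holding a second full list of slices in memory, which may matter with millions of records.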
