Jan-02-2025, 08:29 PM
(This post was last modified: Jan-02-2025, 09:21 PM by Yoriz.
Edit Reason: Added code tags
)
How do I convert the following code to use the limit and offset parameters and fetch a large dataset in chunks, instead of doing a get_all-type operation? The code I want to convert is below. There seems to be a way of paginating the download of the data and then creating the DataFrame once at the end, instead of going through the single ds_records call you see below:
client = Socrata(nmdx_url, token, username, password, timeout=90)
item_count += 1
# Process metadata
ds_md = client.get_metadata(ds)
print('=[ ' + str(item_count) + ' ]==================================================================================================================')
print(f"ID: {ds_md['id']}")
print(f"NAME: {ds_md['name']}")
print(f"CATEGORY: {ds_md['category']}")
print(f"DESCRIPTION: {ds_md['description']}")
#print(f"rowsUpdatedAt: {ds_md['rowsUpdatedAt']} ({datetime.fromtimestamp(ds_md['rowsUpdatedAt'])})")
# Fetch records from the table (reuses ds_md from above, so no second get_metadata call)
ds_cols = [colname['fieldName'] for colname in ds_md['columns']]
ds_records = client.get(ds, where='', limit=10)
#ds_records = client.get_all(ds, where='')
#ds_records = client.get_all(ds)
ds_df = pd.DataFrame.from_records(ds_records, columns=ds_cols)
How can I do this without the single ds_records call shown here, while still resolving the column names? Something like:
all_data = []
offset = 0
while True:
    chunk = client.get(ds, limit=chunk_size, offset=offset)
    if not chunk:  # Check for empty response
        break
    all_data.extend(chunk)
    offset += chunk_size
Again, I'm trying to call client.get() many times, each time with a limited amount of data, to keep the network traffic reasonable. How is this done?
Thanks
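One possible way to do the paging described above is sketched here. It is only a sketch under some assumptions: the helper name fetch_all_rows and the default chunk size are made up for illustration, the client is assumed to behave like sodapy's Socrata.get() (returning a list of dicts and accepting limit, offset and order), and the order=":id" default is an assumption intended to keep the row order stable between pages.

```python
from typing import Any, Dict, List


def fetch_all_rows(client: Any, dataset_id: str, chunk_size: int = 1000,
                   **query: Any) -> List[Dict[str, Any]]:
    """Download a dataset page by page and return all rows as one list.

    `client` is assumed to expose get(dataset_id, limit=..., offset=..., ...)
    and to return a list of dicts, as sodapy's Socrata client does.
    """
    # Assumption: a fixed sort order keeps pages from overlapping or skipping rows.
    query.setdefault("order", ":id")

    rows: List[Dict[str, Any]] = []
    offset = 0
    while True:
        chunk = client.get(dataset_id, limit=chunk_size, offset=offset, **query)
        if not chunk:  # an empty page means there is nothing left to fetch
            break
        rows.extend(chunk)
        # Advance by what was actually returned, in case the server
        # hands back fewer rows than requested.
        offset += len(chunk)
    return rows
```

With the real client this could be called as ds_records = fetch_all_rows(client, ds, chunk_size=5000), followed by the existing pd.DataFrame.from_records(ds_records, columns=ds_cols) line, so the DataFrame is built once after the last page has arrived. Each request then carries at most chunk_size rows, at the cost of one extra request that returns the empty page.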
