Learning Python, need suggestions

vkozinec · Mar-22-2017, 05:12 PM

Hello everyone,

I am trying to learn a bit about Python since I researched and saw it's a really powerful language.

So for my first "hello world" project (well, maybe this one is a bit more advanced) I decided to write a script that downloads all the images from a certain Facebook page (group).

First, I started with this nice example of getting a .csv file from Facebook Graph data with this awesome Python code: Facebook Page Post Scraper by minimaxir (I can't link it since this is my first post, sorry author!)

That was my starting point. From this I wanted to extract all the photo links and download the images, simple :)

This is the code I wrote and it works! But it surely has many holes and could be improved.

import csv
import urllib.request
from collections import defaultdict

columns = defaultdict(list) # each value in each column is appended to a list

with open('disu.txt') as f:
    reader = csv.DictReader(f) # read rows into a dictionary format
    for row in reader: # read a row as {column1: value1, column2: value2,...}
        for (k,v) in row.items(): # go over each column name and value 
            columns[k].append(v) # append the value into the appropriate list based on column name k

strings = columns['status_link'] # take only status_link column out of list

# loop and remove any of '' empty values inside list if any exists. This is to avoid errors in further loop if there is no url to download.
while True:
  try:
    strings.remove("")
  except ValueError:
    break

# split list value ( it's always the same URL ), and take only img ID from url 
strings = [i.rsplit('/', 2)[-2] for i in strings]

# find the length of the list, so we can loop through all values
sumtotal = len(strings)
count = 0

#while (count < sumtotal-1):    this is the "while" for an automatic loop based on list length; for now we just take a few elements to test

while (count < 20):
    count = count + 1
    one = strings[count]
    one = one.rsplit('/', 1)[-1]
    newurl = ('https://graph.facebook.com/'+ one +'/picture')
    urllib.request.urlretrieve(newurl, 'slike/test'+ str(count) +'.jpg')
    print(newurl, "Downloaded")

I am aware this is a bad way of downloading images, since:

I don't use the API that Facebook provides; I used a CSV file from the other script mentioned above.
I found that using ('https://graph.facebook.com/'+ one +'/picture') redirects to the real image, and it works! But it's probably a bad way to do it.
I don't have any checks for 404 or 500 errors, so if one happens, my script stops.
Also mind that I just started to learn programming and Python, so the loops above may be soooo wrong, but that's why I am here.

What I'd like from you is help with suggestions for fixing the code, if that's possible. (By the way, the indentation in the second while loop may look off in this post; I tried to fix it but couldn't. The code itself works, though.)
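
For reference, the CSV-reading and cleanup steps in the script above could probably be condensed. The following is only a sketch: the function name `extract_ids` and the sample URLs are made up for illustration, and it assumes the real links end with a slash after the photo ID, which is what `rsplit('/', 2)[-2]` in the script suggests.

```python
import csv
import io


def extract_ids(csv_file, column="status_link"):
    """Return the photo ID from every non-empty URL in the given column."""
    reader = csv.DictReader(csv_file)
    # Skip rows where the column is missing or empty, so no later cleanup loop is needed.
    links = [row[column] for row in reader if row.get(column)]
    # Same split as rsplit('/', 2)[-2] in the script above.
    return [link.rsplit("/", 2)[-2] for link in links]


# Made-up sample data standing in for the real CSV file (hypothetical URLs).
sample = io.StringIO(
    "status_id,status_link\n"
    "1,https://www.facebook.com/page/photos/a.1/111/\n"
    "2,\n"
    "3,https://www.facebook.com/page/photos/a.1/333/\n"
)
print(extract_ids(sample))  # ['111', '333']
```

With the real file this would be called as `extract_ids(open('disu.txt', newline=''))`, ideally inside a `with` block. Reading only the one column avoids building a list for every column and removes the need for the `while True` / `remove("")` loop.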

wavic · (This post was last modified: Mar-22-2017, 06:44 PM by wavic.)

You should use the Facebook API.

The while loop can be replaced with a for loop:

# enumerate returns a sequence number for each iteration
# strings[:20] gets the first 20 elements. This is called slicing*
for count, one in enumerate(strings[:20]):
    one = one.rsplit('/', 1)[-1]
    newurl = ('https://graph.facebook.com/'+ one +'/picture')

    try:
        urllib.request.urlretrieve(newurl, 'slike/test{}.jpg'.format(count))
        print(newurl, "Downloaded")
    except Exception as e:
        print(type(e))  # the exception class
        print(e.args)   # arguments stored in the exception
        print(e)        # human-readable message
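
If the goal is specifically to survive 404 or 500 responses, catching the `urllib.error` exceptions separately may be a bit more informative than a blanket `except Exception`. This is a rough sketch, not the only way to do it: the function name `download_all`, its `dest_pattern` argument, and the `fetch` parameter (there only so the function can be tried without touching the network) are all made up for illustration.

```python
import urllib.error
import urllib.request


def download_all(ids, dest_pattern="slike/test{}.jpg", fetch=urllib.request.urlretrieve):
    """Try every ID; return (downloaded, failed) instead of stopping on the first error."""
    downloaded, failed = [], []
    for count, photo_id in enumerate(ids):
        url = "https://graph.facebook.com/{}/picture".format(photo_id)
        try:
            fetch(url, dest_pattern.format(count))
        except urllib.error.HTTPError as e:
            # The server answered with an error status such as 404 or 500.
            # HTTPError is a subclass of URLError, so it has to be caught first.
            failed.append((url, "HTTP {}".format(e.code)))
        except urllib.error.URLError as e:
            # No usable answer at all (DNS failure, refused connection, ...).
            failed.append((url, str(e.reason)))
        else:
            downloaded.append(url)
    return downloaded, failed
```

Calling `download_all(strings[:20])` would then keep going after a failed download, and the returned `failed` list shows which URLs to retry or inspect.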

* slicing
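
A quick illustration of slicing and `enumerate`, using a made-up list in place of the real IDs. It also shows a detail of the original while loop: because `count` is incremented before it is used as an index, that loop reads `strings[1]` through `strings[20]` and never touches `strings[0]`.

```python
strings = ["id{}".format(n) for n in range(25)]  # stand-in for the real ID list

first_20 = strings[:20]       # items 0..19
print(len(first_20))          # 20
print(len(strings[:100]))     # 25: slicing past the end raises no IndexError

# enumerate pairs each item with its position, starting at 0 by default
for count, one in enumerate(strings[:3]):
    print(count, one)

# The original while loop, reduced to the indexing logic only
count = 0
visited = []
while count < 20:
    count = count + 1
    visited.append(strings[count])
print(visited[0], visited[-1])  # id1 id20
```

The for loop with `enumerate(strings[:20])` starts at index 0 and also copes with lists shorter than 20 items, where the while loop would raise an IndexError.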
