Please help scrape a list of numbers from the source code of a webpage

All other Source.Python topics and issues.
krishna108
Junior Member
Posts: 2
Joined: Sat Sep 19, 2015 2:50 pm

Post by krishna108 » Sat Sep 19, 2015 2:54 pm

1. I want the program to read the website's source.

2. Then it should copy just the numbers that appear after the = sign, and only for links that look like mypage.php?REF=23273273

3. Then I want to put them in a list, so that a later piece of code can take each number from that list and insert it into a paragraph, repeating the paragraph once per number.

4. Finally, I want to write those paragraphs to a txt file.

The desired output is

5646556
6564654
454654
4646546

and so on
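
As a rough sketch of the extraction step described above (not the code from this post): assuming Python 3 and only the standard library, the REF values can be pulled out of the anchor tags with html.parser and urllib.parse. The names RefCollector, extract_refs and sample_html are made up for illustration, and the sample HTML is an invented stand-in for the real page source.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs


class RefCollector(HTMLParser):
    """Collect the REF value of every <a> tag that links to mypage.php."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # attrs is a list of (name, value) pairs; value may be None
        href = dict(attrs).get("href") or ""
        parsed = urlparse(href)
        # Compare only the last path segment, so /dir/mypage.php also counts
        if parsed.path.rsplit("/", 1)[-1] != "mypage.php":
            return
        # parse_qs maps each query key to a list of its values
        for value in parse_qs(parsed.query).get("REF", []):
            if value.isdigit():
                self.refs.append(value)


def extract_refs(html):
    """Return the REF numbers (as strings) in the order they appear."""
    collector = RefCollector()
    collector.feed(html)
    return collector.refs


# Hypothetical sample standing in for the downloaded page source
sample_html = """
<a href="mypage.php?REF=5646556">first</a>
<a href="other.php?REF=111">ignored: different page</a>
<a href="/dir/mypage.php?REF=6564654&amp;x=1">second</a>
<a href="mypage.php?ID=42">ignored: no REF parameter</a>
"""

refs = extract_refs(sample_html)
print("\n".join(refs))  # one number per line
```

Reading the query string with parse_qs, rather than splitting on "=", keeps the number clean even when the link carries extra parameters after REF.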

This is the code I am working with.

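from bs4 import BeautifulSoup
import urllib2  # Python 2 only; on Python 3 this lives in urllib.request
import re

url = "somewebsite"

# Some servers reject requests that lack a browser-like User-Agent
headers = {'User-Agent': 'Mozilla/5.0'}
html = urllib2.urlopen(urllib2.Request(url, None, headers)).read()
soup = BeautifulSoup(html, 'html.parser')

# Keep only anchors whose href looks like mypage.php?REF=<digits>
links = soup.find_all('a', href=re.compile(r'.*mypage\.php\?REF=[0-9]+'))
template = """lasljasfkljaslkfj{}
slajfljasflk
aslkjfklasjflkasjf
alksjflkasjf;lk
"""

# find_all returns Tag objects, not strings, so read the href attribute
# and capture only the digits that follow REF=
replace = [re.search(r'REF=([0-9]+)', link['href']).group(1) for link in links]

# One filled-in copy of the template per number
output = [template.format(r) for r in replace]

print output
with open('output.txt', 'w') as f_output:
    f_output.write(''.join(output))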


Here is the other half of the original program. It just takes numbers from a list that you have to type in by hand, puts each of those numbers into a paragraph, and then repeats that paragraph with the next number from the list inserted.

template = """fjajflakjfakjfl;kj REF={}
sklkasalsjklas
klajsl;kdajs;djas
aksljl;askjflka
"""

replace = [1131062,
1140921,
1141326,
1141355,
1141426,
1141430,
1141461,
1141473,
1141477,
1141502,
1141525,
1141622,
1141662,
757053,
989967]

# One filled-in copy of the template per number
output = [template.format(r) for r in replace]

with open('output.txt', 'w') as f_output:
    f_output.write(''.join(output))
Ayuto
Project Leader
Posts: 2195
Joined: Sat Jul 07, 2012 8:17 am
Location: Germany

Post by Ayuto » Sat Sep 19, 2015 9:34 pm

Just wondering: what's the question? Or where do you have problems?

Post by krishna108 » Sun Sep 20, 2015 7:42 am

Ayuto wrote: Just wondering: what's the question? Or where do you have problems?

Which part was misunderstood? Please tell me and I will clarify.
Thank you.

Post by Ayuto » Sun Sep 20, 2015 3:09 pm

You just posted code, but didn't say what's wrong with it. What's your question?
