and then copy just a set of numbers that appear on after the = sign only for links that are like mypage.php?REF=23273273
Then i want to put them in a list (in a subsequent code) with will take each number from that list
and put that in a paragraph which will copy it self.
then i want to print such paragraphs in a txt file.
The desired output is
5646556
6564654
454654
4646546
and so on
This is the code i am working with.
Syntax: Select all
from bs4 import BeautifulSoup
import urllib2
import re
url = "somewebsite"
headers = { 'User-Agent' : 'Mozilla/5.0' }
html = urllib2.urlopen(urllib2.Request(url, None, headers)).read()
soup = BeautifulSoup(html)
links = soup.findAll('a', href=re.compile('.*mypage\.php\?REF=[0-9]*'))
template = """lasljasfkljaslkfj{}
slajfljasflk
aslkjfklasjflkasjf
alksjflkasjf;lk
"""
replace = [ link.split("=")[1] for link in links ]
output = [template.format(r) for r in replace]
print output
with open('output.txt', 'w') as f_output:
f_output.write(''.join([template.format(r) for r in replace]))
Here was the other half of the original program. This program just takes numbers from a list that u have to ype and it puts each of those numbers in a paragraph and then copies that paragraph with the next number being inserted from the list.
Syntax: Select all
template = """fjajflakjfakjfl;kj REF={}
sklkasalsjklas
klajsl;kdajs;djas
aksljl;askjflka
"""
replace = [1131062,
1140921,
1141326,
1141355,
1141426,
1141430,
1141461,
1141473,
1141477,
1141502,
1141525,
1141622,
1141662,
757053,
989967]
output = [template.format(r) for r in replace]
with open('output.txt', 'w') as f_output:
f_output.write(''.join([template.format(r) for r in replace]))