FREE PYTHON CODES

1. Program to get all the text from a site and store it in a .txt file

(Requires BeautifulSoup: pip install beautifulsoup4)

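import urllib.request

from bs4 import BeautifulSoup

url = input('Enter url:   ')

# Download the raw HTML of the page
html = urllib.request.urlopen(url).read()

# Name the parser explicitly so BeautifulSoup does not have to guess one
soup = BeautifulSoup(html, 'html.parser')

# Remove script and style elements so their contents do not end up in the text
for script in soup(["script", "style"]):
    script.extract()

text = soup.get_text()

# Strip leading and trailing whitespace from every line
lines = (line.strip() for line in text.splitlines())
# Break runs separated by double spaces into separate chunks
chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
# Drop empty chunks and put each remaining chunk on its own line
text = '\n'.join(chunk for chunk in chunks if chunk)
# Discard any characters that are not plain ASCII
text = text.encode('ascii', 'ignore').decode('ascii')

# Write the cleaned text; the with block closes the file automatically
with open('Data.txt', 'w') as fhand:
    fhand.write(text)

print('Text copied to Data.txt file')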
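
The whitespace clean-up in the middle of the program can be tried on its own, without a network connection or BeautifulSoup. The sketch below is written for Python 3 and wraps those three lines in a function. The function name collapse_whitespace and the sample string are made up for illustration only.

```python
def collapse_whitespace(text):
    """Return text with blank lines removed and one chunk per line."""
    # Strip leading and trailing whitespace from every line
    lines = (line.strip() for line in text.splitlines())
    # Split each line wherever two spaces occur in a row
    chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
    # Keep only the non-empty chunks, one per line
    return '\n'.join(chunk for chunk in chunks if chunk)


# Hypothetical sample, similar to what soup.get_text() might return
sample = "  Title  Subtitle \n\n\n   Body text here.  \n"
print(collapse_whitespace(sample))
```

Each run of text separated by two or more spaces ends up on its own line, and blank lines disappear.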

Regex code to find all email addresses:

import re
re.findall(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+", string_name)
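
A short usage sketch for the email regex, written for Python 3. It uses the pattern without the ^ and $ anchors, because an anchored pattern only matches when the whole string is a single address, while the goal here is to pick addresses out of longer text. The name find_emails and the sample addresses are made up for illustration.

```python
import re

# Same character classes as the regex above, without ^ and $ anchors
EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+")


def find_emails(text):
    """Return every substring of text that looks like an email address."""
    return EMAIL_PATTERN.findall(text)


# Hypothetical sample text
sample = "Contact alice@example.com or bob.smith@mail.example.org for details"
print(find_emails(sample))
```

This is a loose match, not a full address validator. For instance, a full stop directly after an address would be picked up as part of the domain.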

Program to find all files of a particular type (e.g. .mp3, .doc, .mp4) on the PC, along with their file paths

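import fnmatch
import os

# Start the search at the filesystem root; on Windows use a drive such as 'C:\\'
rootdir = '/'
pattern = input('Enter file type to search, prefixed with * (e.g. *.mp3): ')

# Walk every directory under rootdir and print the full path of each match
for root, subdirnames, filelist in os.walk(rootdir):
    for filename in fnmatch.filter(filelist, pattern):
        print(os.path.join(root, filename))
print('ALL FILES FOUND')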
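
Searching from the root of the disk can take a long time, so it may help to have the same search as a function that accepts any starting folder. The following is a minimal Python 3 sketch. The name find_files and the example folder are assumptions, not part of the program above.

```python
import fnmatch
import os


def find_files(rootdir, pattern):
    """Return the full paths of all files under rootdir whose names match pattern."""
    matches = []
    # os.walk visits rootdir and every folder below it
    for root, subdirnames, filelist in os.walk(rootdir):
        # fnmatch.filter keeps only the names matching the wildcard pattern
        for filename in fnmatch.filter(filelist, pattern):
            matches.append(os.path.join(root, filename))
    return matches


# Example call on a hypothetical folder:
# for path in find_files('/home/user/Music', '*.mp3'):
#     print(path)
```

Returning a list instead of printing makes it possible to count, sort or further filter the results.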

Converting .doc and .docx files to .txt files


Friday, 17 June 2016
