Basic steps of crawling web data with Python:
*
from urllib import request
response = request.urlopen('full URL')
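A minimal runnable sketch of the urllib step. The `data:` URL below is a stand-in so the example works offline; in practice you would pass the full URL of the page you want to crawl:

```python
from urllib import request

# A data: URL stands in for a real page so this runs offline;
# replace it with the full URL of the page you want to crawl.
url = "data:text/html,<html><body>hello</body></html>"

with request.urlopen(url) as response:
    html = response.read().decode("utf-8")

print(html)  # the raw HTML of the page
```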
*
import requests
import chardet

url = 'full URL'
response = requests.get(url)
# detect the encoding from the raw bytes, then decode the text correctly
response.encoding = chardet.detect(response.content)['encoding']
html = response.text
*
selenium (for webpages whose content is loaded dynamically by JavaScript)
from selenium import webdriver
# driver = webdriver.Chrome()  # needs a matching browser driver installed
# driver.get(url); html = driver.page_source; driver.quit()
*
the Scrapy framework
----- extract content ------
Generally, inspect the page in the browser developer console: first look for a repeated, unified structure, then locate its parent element.
1. regular expressions
2. BeautifulSoup
3. selenium's element-locating methods
4. XPath
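A sketch of two of the extraction methods above on the same HTML fragment. The fragment is invented sample data; for XPath it uses the stdlib `xml.etree.ElementTree`, which supports only a subset of XPath (lxml is the usual choice on real, messier pages):

```python
import re
import xml.etree.ElementTree as ET

# A small fragment standing in for a downloaded page; in practice
# `html` would come from response.text above.
html = """
<ul class="news">
  <li><a href="/a">First story</a></li>
  <li><a href="/b">Second story</a></li>
</ul>
"""

# 1. Regular expression: quick, but brittle against markup changes.
links = re.findall(r'<a href="([^"]+)">([^<]+)</a>', html)
print(links)  # [('/a', 'First story'), ('/b', 'Second story')]

# 4. XPath (ElementTree subset): works here because the fragment
# is well-formed; real HTML usually needs lxml or BeautifulSoup.
root = ET.fromstring(html)
titles = [a.text for a in root.findall(".//li/a")]
print(titles)  # ['First story', 'Second story']
```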
----- storage content -------
1. txt
2. csv
3. Excel
4. MongoDB
5. MySQL
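A sketch of the two simplest storage options above, using only the standard library. The rows are hypothetical sample data standing in for extracted results:

```python
import csv

# Hypothetical rows, as extracted by one of the methods above.
rows = [
    {"title": "First story", "url": "/a"},
    {"title": "Second story", "url": "/b"},
]

# 1. Plain text: one record per line, tab-separated.
with open("stories.txt", "w", encoding="utf-8") as f:
    for row in rows:
        f.write(f"{row['title']}\t{row['url']}\n")

# 2. CSV: keeps columns, opens directly in Excel.
with open("stories.csv", "w", encoding="utf-8", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["title", "url"])
    writer.writeheader()
    writer.writerows(rows)
```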