Tuesday, June 18, 2019

SQL - Search all tables which contain specific columns in one database

USE your_DB_Name;
SELECT      c.name  AS 'ColumnName' ,t.name AS 'TableName'
FROM        sys.columns c  JOIN sys.tables  t ON c.object_id = t.object_id
WHERE       (t.name like 'LA%' or t.name like '%luw%' or t.name like 'Louisana%' )
        and c.name in ('API','API_NUM','API num','Well_Serial_Num','Well_Serial_No',
'WELL_SERIA','Well Serial Number','WSN','Serial_no','Well_Serial','well Serial Num')
ORDER BY    TableName  ,ColumnName;

Thursday, June 13, 2019

Python - Web Scraping - request needs to post json parameters

There are some websites that taking json data as parameters in the url.
e.g.
https://rrcsearch3.neubus.com/esd3-rrc/index.php_module_=esd&_action_=keysearch&profile=28.

This kind of website does not need to do several get or post to get the final result even though there are lots of websites pop up before the final one, like:

What you really need is just the final page + the urlencoded json parameters:

params = urlencode({
    'json': json.dumps({"Neusearch": {"profile": "28", "api_key": "publicuser",  
           "api_sig": "68461babf138014f252e64732ef3b1a0",               
           "Searchitems": {"item": [{"key": "operator_number", "value": '243195'}]},      
           "page": 0, "pageSize": 50, "strict": "true", "saveSearch": "true"}})

})

scraper.do.http_post(
    url=context['address'] + '&' + params,   
    send_to=detail,    
    data=params,    
    context=context
)

But, remember the header should include the referer since you are not doing get and post from the very first page:

headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5).....,
    'Content-Type': 'application/x-www-form-urlencoded',    
    'Referer': 'https://XXXXXXXX/index.php?........}

Difference between json.dumps() and json.dump()
dumps是将dict转化成str格式,loads是将str转化成dict格式。
dump和load也是类似的功能,只是与文件操作结合起来了。