Monday, March 26, 2018

SQL - SCHEMABING thing

To check which views are using a table, you can use:

SELECT VIEW_NAME
FROM INFORMATION_SCHEMA.VIEW_TABLE_USAGE
WHERE TABLE_NAME = 'tbl_well'

To check a certain column in a table being used by other process, you can use:

SELECT DISTINCT *
FROM sys.all_sql_modules
WHERE OBJECT_DEFINITION(OBJECT_ID) LIKE '%tbl_well%'
AND OBJECT_DEFINITION(OBJECT_ID) LIKE '%latitude%';

By including the SCHEMABINDING clause you protect the view/function from unexpected changes to the tables underneath them:

CREATE VIEW [dbo].[v_QC_well]

WITH SCHEMABINDING
...

Reference: 

https://sqlstudies.com/2014/08/06/schemabinding-what-why/

https://www.mssqltips.com/sqlservertip/1692/using-schema-binding-to-improve-sql-server-udf-performance/

Python - Some concepts used in Scrapy, BeautifulSoup, Mechanize

Reference:
CSS https://www.w3schools.com/css/default.asp
XPathhttps://www.w3schools.com/xml/xpath_intro.asp
yieldhttps://stackoverflow.com/questions/231767/what-does-the-yield-keyword-do

Examplehttps://www.digitalocean.com/community/tutorials/how-to-scrape-web-pages-with-beautiful-soup-and-python-3

Module "Mechanize"documentation - http://wwwsearch.sourceforge.net/mechanize/faq.html (Python 3 is not yet supported)

Wednesday, March 14, 2018

SQL - Want your query run faster?

Sometimes if you want to check if data in table A not exist in table B, but there is no data in the table B. If table A has large datasets, usually it takes very long time for the query to finish. Here is an alternative way to use not exists

set statistics time on 

select top 100000 a.*
from xxxxx.dbo.tbl_well_bridge a
     left join #Temp b  on a.api = b.api
where b.api is null;

select  top 100000 a.*
from xxxxxx.dbo.tbl_well_bridge a
where api not in (select api from #Temp)

select  top 100000  *
from xxxxx.dbo.tbl_well_bridge a
where not exists (select api from #Temp b where a.api = b.api )

set statistics time off

And the comparison is below. You can see clearly that which one is saving more time in total!


(100000 rows affected)

 SQL Server Execution Times:
   CPU time = 140 ms,  elapsed time = 22396 ms.

(100000 rows affected)

 SQL Server Execution Times:
   CPU time = 63 ms,  elapsed time = 8155 ms.

(100000 rows affected)

 SQL Server Execution Times:
   CPU time = 78 ms,  elapsed time = 6028 ms.

Monday, March 12, 2018

SQL - Want to make your Alias Column Name dynamic?

I was dealing with a not very cleaned CSV file, which is comma delimited, with "" around each column data, yet, data itself has comma, and worse, even double quote.

sample data:










It is hard for SSIS packages to load such a CSV file clean without column shifting. So what I did was to load the data into just one column, and try to separate them to each column in SQL.

From
To


















Here is the way I do it: I combined cursor and dynamic SQL to make the column alias name dynamic:

/****** Script for SelectTopNRows command from SSMS  ******/
IF OBJECT_ID('tempdb..#Names') IS NOT NULL DROP TABLE #Names;

create table #Names ( ColID INT, ColName VARCHAR(50));

INSERT INTO #Names (ColId, ColName)   
Values(1,'[APINo]'),  (2,'[FileNo]'),(3,'[CurrentOperator]'), (4,'[CurrentWellName]'), (5,'[LeaseName]'),(6,'[LeaseNumber]'),
      (7,'[OriginalOperator]'),(8,'[OriginalWellName]'), (9,'[SpudDate]'),(10,'[TD]'),  (11,'[CountyName]'), (12,'[Township]'),
      (13,'[Range]'),(14,'[Section]'), (15,'[QQ]'), (16,'[Footages]'),(17,'[FieldName]'), (18,'[ProducedPools]'),
      (19,'[OilWaterGasCums]'),(20,'[IPTDateOilWaterGas]'),(21,'[Wellbore]'),(22,'[Latitude]'), (23,'[Longitude]'),
      (24,'[WellType]'), (25,'[WellStatus]'), (26,'[CTB]'),  (27,'[WellStatusDate]');
  
SELECT ColId, ColName 
from #Names;
-------------Calculate distance based on county---------------
DECLARE @ColId INT;
DECLARE CURSOR_1 CURSOR 
LOCAL STATIC READ_ONLY FORWARD_ONLY
FOR 

SELECT DISTINCT ColId
FROM  #Names

OPEN CURSOR_1;

FETCH NEXT FROM CURSOR_1 INTO @ColId;
WHILE @@FETCH_STATUS = 0
BEGIN 
    
DECLARE @ColName VARCHAR(50);
DECLARE CURSOR_2 CURSOR 
FOR 

SELECT DISTINCT ColName
FROM  #Names 
WHERE colId =   @ColId

OPEN CURSOR_2;
FETCH NEXT FROM CURSOR_2 INTO @ColName;
WHILE @@FETCH_STATUS = 0
BEGIN 

DECLARE @sql NVARCHAR(max)
DECLARE @column_alias VARCHAR(50)
SET @column_alias = @ColName
SET @sql = '

with tbl_raw as(
SELECT *
from [dbo].[ND_WellIndex_Load]
), tbl_extract as (
select a.id,right(a.raw_data,len(a.raw_data)-charindex(''",'',a.raw_data)-1) as raw_data,left(a.raw_data,charindex(''",'',a.raw_data)) as' + @column_alias +
'from tbl_raw a 
)
update a
set a.raw_Data = b.raw_data ,a.' + @column_alias + '= b.' + @column_alias +
'from tbl_raw a,
tbl_extract b
where a.id = b.id '

exec sp_executesql @sql

FETCH NEXT FROM CURSOR_2 INTO @ColName;
END;
CLOSE CURSOR_2;
DEALLOCATE CURSOR_2;

        FETCH NEXT FROM CURSOR_1 INTO @ColId;

END;
CLOSE CURSOR_1;
DEALLOCATE CURSOR_1;

Tuesday, February 27, 2018

SQL - Create Stored Procedures with optional parameters

If you want to make your code generic, say, sometimes you want to put certain criteria to the code to make it run faster,  and sometimes you just want to see all data no matter how long it will take. You can specify your declared parameter part like this:


ALTER PROCEDURE  dbo.sp_NAWAT_MISC_SpotfireInternalQC
(@state_name nvarchar(255) = NULL,
 @play_name  nvarchar(255) =  NULL
 )

 AS
BEGIN 

    ---- your code
           WHERE         (state_name = @state_name OR state_name is null)
                        AND  (play_name = @play_name OR  play_name is null) ;

END 
GO

or
(reference:SQL - Passing multiple values through one parameter in a stored procedure)











 AS
BEGIN 
  SET @Sql = '

    ---- your code
   



Thursday, February 1, 2018

Spotfire - Some properties in Expressions

Properties in Expressions


There are three types of properties available in TIBCO Spotfire: document properties, data table properties and column properties. All of these can be used in expressions that control one or many settings. The properties can be interpreted in two different ways, either as actual values or as text values that can be used instead of column names or measures.
Insert as Text
When you want to use a string property in an expression where it should be interpreted as a column name or a measure you must use a specific syntax. The name of the property should be enclosed in the following: “${“ and “}”. So, for a document property this could look something like: ${MyProperty}. For a data table property, the first part should be the data table name within the symbols specified above, followed by a period and the property name also within curly brackets: ${My Data Table}.{MyProperty}. For a column property, the column name is also required: ${My Data Table}.{Column Name}.{MyProperty}.









When the $ sign is encountered in an expression, the property will be replaced by its corresponding value before the actual expression is evaluated (a preprocessing step). This means that if you type an expression like: sum(${Property1}) -${Property1}-${Property2}, and the value of Property1 is the column name Column1, and the value of Property2 is Column2, then the resulting expression will be: sum([Column1]) -[Column1]-[Column2].
If you do not want the property value to be interpreted as a column name but as a literal string value, you should put the property call within quotation marks. For example, if you have a property called "MyProperty" with the value "MyValue", the expression ${MyProperty} would return the resulting expression MyValue, where a column called "MyValue" is retrieved. However, the expression "${MyProperty}" would return the text string "MyValue" instead. If the string value consists of several words (e.g., "My Value"), then you should use $esc() or put brackets around the property expression in order to return a column: [${MyProperty}]. See General Syntax for more information about column names.
Note that it is very important to use the correct syntax for the expression at all times. Otherwise you may end up with a different result than you expected. If a property is inserted as a column where the expression expected a string, the first value of the specified column will be retrieved. For string columns, the result may be a valid expression, but for a column of another data type, you may receive an error.
If a property is inserted using the Insert Properties button in the Insert Calculated Column dialog or in the Custom Expression dialog, the property will automatically be added using the text (preprocessor) syntax.











Wednesday, January 31, 2018

Spotfire - TERR dealing with date

 It is really hard for TERR to deal with date type in Spotfire. You cannot use as.Date() in TERR which works perfectly fine in R Studio.

Reference:
https://datashoptalk.com/convert-strings-date-time-terr/#comment-3303


Spotfire data functions recognize TERR objects of class “POSIXct” as date/time information. As designed, the Spotfire/TERR data function interface for date/time information does the following:
– Converts a Spotfire value or column whose DataType is “Date”, “Time” or “DateTime” into a TERR object of class “POSIXct”.
– Converts a TERR object of class “POSIXct” into a Spotfire value or column with a DataType of “DateTime”, which can then be formatted in Spotfire to display only the date (or to display only the time) if needed.
This interface does not use any other TERR object classes (such as the “Date” class in TERR) to transfer date/time information between Spotfire and TERR.

You can do this in R Studio, but this doesn't work in TERR:

A way around it is to use strptime()