Wednesday, January 31, 2018

Spotfire - TERR dealing with date

It is really hard for TERR to deal with date types in Spotfire: as.Date(), which works perfectly fine in RStudio, does not work in a TERR data function.

Reference:
https://datashoptalk.com/convert-strings-date-time-terr/#comment-3303


Spotfire data functions recognize TERR objects of class “POSIXct” as date/time information. As designed, the Spotfire/TERR data function interface for date/time information does the following:
– Converts a Spotfire value or column whose DataType is “Date”, “Time” or “DateTime” into a TERR object of class “POSIXct”.
– Converts a TERR object of class “POSIXct” into a Spotfire value or column with a DataType of “DateTime”, which can then be formatted in Spotfire to display only the date (or to display only the time) if needed.
This interface does not use any other TERR object classes (such as the “Date” class in TERR) to transfer date/time information between Spotfire and TERR.

You can return the result of as.Date() in RStudio, but this doesn't work in a TERR data function, because the output is of class "Date", which the interface does not transfer.

A way around it is to use strptime() and convert the result with as.POSIXct(), so the data function returns a "POSIXct" object.
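A minimal sketch of the workaround (the input strings and format string are made-up examples):

```r
dates_as_strings <- c("2018-01-31", "2018-02-01")

# strptime() parses the strings; as.POSIXct() guarantees the
# "POSIXct" class that the Spotfire/TERR interface expects
output_dates <- as.POSIXct(strptime(dates_as_strings, format = "%Y-%m-%d"))

class(output_dates)  # "POSIXct" "POSIXt"
```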


R - scraping url and download csv from website
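A minimal R sketch of the idea, assuming a page that links to CSV files (the URL and the rvest usage are illustrative placeholders, not code from any specific site):

```r
library(rvest)

# hypothetical page that lists CSV downloads
page <- read_html("http://example.com/data")

# collect the href of every link, then keep only the .csv ones
links <- html_attr(html_nodes(page, "a"), "href")
csv_links <- links[grepl("\\.csv$", links)]

# download each file and read the last one into a data frame
for (u in csv_links) {
  dest <- basename(u)
  download.file(u, dest, mode = "wb")
}
df <- read.csv(dest)
```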


SQL - Call Stored Procedure in a view

Typically, you cannot directly EXEC a stored procedure in a view, but we can use OPENQUERY() to play the trick:



But you do need a linked server (it can point back to the local instance) and data access rights on it to run OPENQUERY().
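A sketch of the trick; [LoopbackServer] is a hypothetical linked server pointing back to the local instance, and the database, view, and procedure names are made up:

```sql
-- Wrap the stored procedure call in OPENQUERY so it can live in a view
CREATE VIEW dbo.vw_SalesFromProc
AS
SELECT *
FROM OPENQUERY([LoopbackServer], 'EXEC MyDb.dbo.usp_GetSales');
```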

Monday, January 8, 2018

Spotfire - Unpivot JSON data to table

import clr
clr.AddReference('System.Data')
clr.AddReference('System.Web.Extensions')
import System
from System import DateTime
from System.Data import DataSet, DataTable
from System.IO import StreamReader, StreamWriter, MemoryStream, SeekOrigin
from System.Net import HttpWebRequest
from System.Web.Script.Serialization import JavaScriptSerializer
from Spotfire.Dxp.Data import DataType, DataTableSaveSettings
from Spotfire.Dxp.Data.Import import TextFileDataSource, TextDataReaderSettings

myJson = '''[
{"Column 1": 1,"API":"42013342780000","PRODUCTION":[{"MONTH": "2011-03-01","OIL_BOE":5657,"GAS_BOE":1},{"MONTH":"2011-04-01","OIL_BOE":6328,"GAS_BOE":1}]},
{"Column 1": 2,"API":"42013309790000","PRODUCTION":[{"MONTH": "2002-01-01","OIL_BOE": 3,"GAS_BOE":1}, {"MONTH": "2002-02-01","OIL_BOE": 4,"GAS_BOE":1}]},
{"Column 1": 3,"API":"42013309800000","PRODUCTION":[{"MONTH": "2002-01-01","OIL_BOE": 5,"GAS_BOE":1}, {"MONTH": "2002-02-01","OIL_BOE": 6,"GAS_BOE":1}, {"MONTH": "2002-03-01","OIL_BOE": 7,"GAS_BOE":1}]}
]'''


# Deserialize the JSON string into .NET objects
wells = JavaScriptSerializer().DeserializeObject(myJson)

# flatten: one tab-separated row per (API, PRODUCTION record) pair
strData = ''
for well in wells:
    for x in well['PRODUCTION']:
        strData += well['API'] + '\t' + str(x['MONTH']) + '\t' + str(x['OIL_BOE']) + '\t' + str(x['GAS_BOE']) + '\r\n'

def generateDataTableFromString(strData, tblName, colHeaders, colTypes):
    # headers
    headers = '\t'.join(colHeaders) + '\r\n'

    # append data to header
    strData = headers + strData

    # make a stream from the string
    stream = MemoryStream()
    writer = StreamWriter(stream)
    writer.Write(strData)
    writer.Flush()
    stream.Seek(0, SeekOrigin.Begin)

    # set up the text data reader
    readerSettings = TextDataReaderSettings()
    readerSettings.Separator = "\t"
    readerSettings.AddColumnNameRow(0)

    # assign data type
    for i, col in enumerate(colTypes):
        readerSettings.SetDataType(i, col)

    # create a data source to read in the stream
    textDataSource = TextFileDataSource(stream, readerSettings)

    # add the data into a Data Table in Spotfire
    if Document.Data.Tables.Contains(tblName):
        Document.Data.Tables[tblName].ReplaceData(textDataSource)
    else:
        newTable = Document.Data.Tables.Add(tblName, textDataSource)
        tableSettings = DataTableSaveSettings(newTable, False, False)
        Document.Data.SaveSettings.DataTableSettings.Add(tableSettings)


# get parameters to build route details tables
tblName = 'well_data_1'
tblColumnsHeaders = ['API', 'MONTH','OIL_BOE','GAS_BOE']
tblColumnsTypes = [DataType.String, DataType.Date, DataType.Real, DataType.Real]

# generate tables
generateDataTableFromString(strData, tblName, tblColumnsHeaders, tblColumnsTypes)
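Outside Spotfire, the same unpivot logic can be sketched in plain Python with the standard json module (no .NET or Spotfire dependencies; the data is trimmed to one well for brevity):

```python
import json

my_json = '''[
 {"API": "42013342780000", "PRODUCTION": [
   {"MONTH": "2011-03-01", "OIL_BOE": 5657, "GAS_BOE": 1},
   {"MONTH": "2011-04-01", "OIL_BOE": 6328, "GAS_BOE": 1}]}
]'''

rows = []
for well in json.loads(my_json):
    # repeat the parent API for every nested PRODUCTION record
    for rec in well["PRODUCTION"]:
        rows.append((well["API"], rec["MONTH"], rec["OIL_BOE"], rec["GAS_BOE"]))

print(rows[0])   # ('42013342780000', '2011-03-01', 5657, 1)
print(len(rows)) # 2
```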




 

Tuesday, January 2, 2018

SQL - BINARY_CHECKSUM() to detect row changes

The binary_checksum function computes a checksum value over an entire row of a table or over a list of expressions, and is commonly used to detect changes in a table.
Function usage:
  • Binary_checksum(*) computes the checksum value for the entire row.
  • Binary_checksum(expression1, 2, …, n) computes the checksum value for the given expression list.
Data types that are invalid for computing a checksum value are text, ntext, image, cursor, xml, and noncomparable common language runtime (CLR) user-defined types, so these data types should not be passed to the binary_checksum function.
The function returns an integer value, so we can store the checksum of the original row and compare it with the current row's checksum to check whether that particular row has changed.
Binary_checksum() and checksum() have the same functionality, but binary_checksum() returns different values for string values that differ only in case.


e.g.
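A hedged sketch of both points (all table and column names are made up):

```sql
-- Detect rows whose tracked columns changed between staging and target
SELECT t.ID
FROM dbo.Target t
JOIN dbo.Staging s ON s.ID = t.ID
WHERE BINARY_CHECKSUM(t.Name, t.Qty) <> BINARY_CHECKSUM(s.Name, s.Qty);

-- Case sensitivity: in a case-insensitive collation, CHECKSUM() treats
-- 'Hello' and 'hello' as equal, while BINARY_CHECKSUM() does not
SELECT CHECKSUM('Hello'), CHECKSUM('hello'),
       BINARY_CHECKSUM('Hello'), BINARY_CHECKSUM('hello');
```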




Reference: checksum() vs. hashbytes()




R - Fill in data gaps function

fillNAgaps <- function(x, firstBack=FALSE) {
  ## NA's in a vector or factor are replaced with last non-NA values
  ## If firstBack is TRUE, it will fill in leading NA's with the first
  ## non-NA value. If FALSE, it will not change leading NA's.
 
  # If it's a factor, store the level labels and convert to integer
  lvls <- NULL
  if (is.factor(x)) {
    lvls <- levels(x)
    x    <- as.integer(x)
  }
 
  goodIdx <- !is.na(x)
 
  # These are the non-NA values from x only
  # Add a leading NA or take the first good value, depending on firstBack 
  if (firstBack)   goodVals <- c(x[goodIdx][1], x[goodIdx])
  else             goodVals <- c(NA,            x[goodIdx])
 
  # Fill the indices of the output vector with the indices pulled from
  # these offsets of goodVals. Add 1 to avoid indexing to zero.
  fillIdx <- cumsum(goodIdx)+1
 
  x <- goodVals[fillIdx]
 
  # If it was originally a factor, convert it back
  if (!is.null(lvls)) {
    x <- factor(x, levels=seq_along(lvls), labels=lvls)
  }
 
  x
}
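A quick check of fillNAgaps() on a simple vector (values chosen for illustration):

```r
x <- c(NA, 1, NA, NA, 2, NA)
fillNAgaps(x)                   # NA 1 1 1 2 2
fillNAgaps(x, firstBack = TRUE) #  1 1 1 1 2 2
```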

# Example: split a sorted numeric vector into `bins` roughly equal groups
colInput <- c(2,4,5,6,1,8,9,3,10,13,11,12)
bins <- 5
colInput <- sort(colInput)

colOutput <- c()
# (possibly fractional) starting position of each bin
n <- seq(1, length(colInput), by = length(colInput)/bins)

for (i in 1:length(n)){
  if(i <= n[i]){
    colOutput[ceiling(n[i]):ceiling(n[i]+1)] <-  paste("group " ,toString(i))
    if(is.na(colOutput[floor(n[i])])){colOutput[floor(n[i])]<-paste("group " ,toString(i))}
  }
}


colOutput<-colOutput[1:length(colInput)]

colOutput<-fillNAgaps(colOutput)

tblNewGroup <- cbind(colInput ,colOutput)