Wednesday, January 31, 2018

Spotfire - TERR dealing with date

It is really hard for TERR to deal with date types in Spotfire: as.Date(), which works perfectly fine in RStudio, does not work in a TERR data function.

Reference:
https://datashoptalk.com/convert-strings-date-time-terr/#comment-3303


Spotfire data functions recognize TERR objects of class “POSIXct” as date/time information. As designed, the Spotfire/TERR data function interface for date/time information does the following:
– Converts a Spotfire value or column whose DataType is “Date”, “Time” or “DateTime” into a TERR object of class “POSIXct”.
– Converts a TERR object of class “POSIXct” into a Spotfire value or column with a DataType of “DateTime”, which can then be formatted in Spotfire to display only the date (or to display only the time) if needed.
This interface does not use any other TERR object classes (such as the “Date” class in TERR) to transfer date/time information between Spotfire and TERR.

You can return the result of as.Date() in RStudio, but this doesn't work in a TERR data function, because the output is of class "Date", which the interface does not transfer.

A way around it is to use strptime() and convert the result with as.POSIXct(), so the data function returns a "POSIXct" object.
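A minimal sketch of the workaround (the input strings and format string are made-up examples):

```r
dates_as_strings <- c("2018-01-31", "2018-02-01")

# strptime() parses the strings; as.POSIXct() guarantees the
# "POSIXct" class that the Spotfire/TERR interface expects
output_dates <- as.POSIXct(strptime(dates_as_strings, format = "%Y-%m-%d"))

class(output_dates)  # "POSIXct" "POSIXt"
```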


R - scraping url and download csv from website
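A minimal R sketch of the idea, assuming a page that links to CSV files (the URL and the rvest usage are illustrative placeholders, not code from any specific site):

```r
library(rvest)

# hypothetical page that lists CSV downloads
page <- read_html("http://example.com/data")

# collect the href of every link, then keep only the .csv ones
links <- html_attr(html_nodes(page, "a"), "href")
csv_links <- links[grepl("\\.csv$", links)]

# download each file and read the last one into a data frame
for (u in csv_links) {
  dest <- basename(u)
  download.file(u, dest, mode = "wb")
}
df <- read.csv(dest)
```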


SQL - Call Stored Procedure in a view

Typically, you cannot directly EXEC a stored procedure in a view, but we can use OPENQUERY() to play the trick:



But you do need a linked server (it can point back to the local instance) and data access rights on it to run OPENQUERY().
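A sketch of the trick; [LoopbackServer] is a hypothetical linked server pointing back to the local instance, and the database, view, and procedure names are made up:

```sql
-- Wrap the stored procedure call in OPENQUERY so it can live in a view
CREATE VIEW dbo.vw_SalesFromProc
AS
SELECT *
FROM OPENQUERY([LoopbackServer], 'EXEC MyDb.dbo.usp_GetSales');
```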

Monday, January 8, 2018

Spotfire - Unpivot JSON data to table

import clr
clr.AddReference('System.Data')
clr.AddReference('System.Web.Extensions')
import System
from System import DateTime
from System.Data import DataSet, DataTable
from System.IO import StreamReader, StreamWriter, MemoryStream, SeekOrigin
from System.Net import HttpWebRequest
from System.Web.Script.Serialization import JavaScriptSerializer
from Spotfire.Dxp.Data import DataType, DataTableSaveSettings
from Spotfire.Dxp.Data.Import import TextFileDataSource, TextDataReaderSettings

myJson = '''[
{"Column 1": 1,"API":"42013342780000","PRODUCTION":[{"MONTH": "2011-03-01","OIL_BOE":5657,"GAS_BOE":1},{"MONTH":"2011-04-01","OIL_BOE":6328,"GAS_BOE":1}]},
{"Column 1": 2,"API":"42013309790000","PRODUCTION":[{"MONTH": "2002-01-01","OIL_BOE": 3,"GAS_BOE":1}, {"MONTH": "2002-02-01","OIL_BOE": 4,"GAS_BOE":1}]},
{"Column 1": 3,"API":"42013309800000","PRODUCTION":[{"MONTH": "2002-01-01","OIL_BOE": 5,"GAS_BOE":1}, {"MONTH": "2002-02-01","OIL_BOE": 6,"GAS_BOE":1}, {"MONTH": "2002-03-01","OIL_BOE": 7,"GAS_BOE":1}]}
]'''


# Deserialize the JSON string into .NET objects
wells = JavaScriptSerializer().DeserializeObject(myJson)

# flatten: one tab-separated row per (API, PRODUCTION record) pair
strData = ''
for well in wells:
    for x in well['PRODUCTION']:
        strData += well['API'] + '\t' + str(x['MONTH']) + '\t' + str(x['OIL_BOE']) + '\t' + str(x['GAS_BOE']) + '\r\n'

def generateDataTableFromString(strData, tblName, colHeaders, colTypes):
    # headers
    headers = '\t'.join(colHeaders) + '\r\n'

    # append data to header
    strData = headers + strData

    # make a stream from the string
    stream = MemoryStream()
    writer = StreamWriter(stream)
    writer.Write(strData)
    writer.Flush()
    stream.Seek(0, SeekOrigin.Begin)

    # set up the text data reader
    readerSettings = TextDataReaderSettings()
    readerSettings.Separator = "\t"
    readerSettings.AddColumnNameRow(0)

    # assign data type
    for i, col in enumerate(colTypes):
        readerSettings.SetDataType(i, col)

    # create a data source to read in the stream
    textDataSource = TextFileDataSource(stream, readerSettings)

    # add the data into a Data Table in Spotfire
    if Document.Data.Tables.Contains(tblName):
        Document.Data.Tables[tblName].ReplaceData(textDataSource)
    else:
        newTable = Document.Data.Tables.Add(tblName, textDataSource)
        tableSettings = DataTableSaveSettings(newTable, False, False)
        Document.Data.SaveSettings.DataTableSettings.Add(tableSettings)


# get parameters to build route details tables
tblName = 'well_data_1'
tblColumnsHeaders = ['API', 'MONTH','OIL_BOE','GAS_BOE']
tblColumnsTypes = [DataType.String, DataType.Date, DataType.Real, DataType.Real]

# generate tables
generateDataTableFromString(strData, tblName, tblColumnsHeaders, tblColumnsTypes)
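Outside Spotfire, the same unpivot logic can be sketched in plain Python with the standard json module (no .NET or Spotfire dependencies; the data is trimmed to one well for brevity):

```python
import json

my_json = '''[
 {"API": "42013342780000", "PRODUCTION": [
   {"MONTH": "2011-03-01", "OIL_BOE": 5657, "GAS_BOE": 1},
   {"MONTH": "2011-04-01", "OIL_BOE": 6328, "GAS_BOE": 1}]}
]'''

rows = []
for well in json.loads(my_json):
    # repeat the parent API for every nested PRODUCTION record
    for rec in well["PRODUCTION"]:
        rows.append((well["API"], rec["MONTH"], rec["OIL_BOE"], rec["GAS_BOE"]))

print(rows[0])   # ('42013342780000', '2011-03-01', 5657, 1)
print(len(rows)) # 2
```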




 

Tuesday, January 2, 2018

SQL - BINARY_CHECKSUM() to detect row changes

The binary_checksum function computes a checksum value over an entire row of a table or over a list of expressions, and is commonly used to detect changes in a table.
Function usage:
  • Binary_checksum(*) computes the checksum value for the entire row.
  • Binary_checksum(expression1, 2, …, n) computes the checksum value for the given expression list.
Data types that are invalid for computing a checksum value are text, ntext, image, cursor, xml, and noncomparable common language runtime (CLR) user-defined types, so these data types should not be passed to the binary_checksum function.
The function returns an integer value, so we can store the checksum of the original row and compare it with the current row's checksum to check whether that particular row has changed.
Binary_checksum() and checksum() have the same functionality, but binary_checksum() returns different values for string values that differ only in case.


e.g.
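A hedged sketch of both points (all table and column names are made up):

```sql
-- Detect rows whose tracked columns changed between staging and target
SELECT t.ID
FROM dbo.Target t
JOIN dbo.Staging s ON s.ID = t.ID
WHERE BINARY_CHECKSUM(t.Name, t.Qty) <> BINARY_CHECKSUM(s.Name, s.Qty);

-- Case sensitivity: in a case-insensitive collation, CHECKSUM() treats
-- 'Hello' and 'hello' as equal, while BINARY_CHECKSUM() does not
SELECT CHECKSUM('Hello'), CHECKSUM('hello'),
       BINARY_CHECKSUM('Hello'), BINARY_CHECKSUM('hello');
```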




Reference: checksum() vs. hashbytes()




R - Fill in data gaps function

fillNAgaps <- function(x, firstBack=FALSE) {
  ## NA's in a vector or factor are replaced with last non-NA values
  ## If firstBack is TRUE, it will fill in leading NA's with the first
  ## non-NA value. If FALSE, it will not change leading NA's.
 
  # If it's a factor, store the level labels and convert to integer
  lvls <- NULL
  if (is.factor(x)) {
    lvls <- levels(x)
    x    <- as.integer(x)
  }
 
  goodIdx <- !is.na(x)
 
  # These are the non-NA values from x only
  # Add a leading NA or take the first good value, depending on firstBack 
  if (firstBack)   goodVals <- c(x[goodIdx][1], x[goodIdx])
  else             goodVals <- c(NA,            x[goodIdx])
 
  # Fill the indices of the output vector with the indices pulled from
  # these offsets of goodVals. Add 1 to avoid indexing to zero.
  fillIdx <- cumsum(goodIdx)+1
 
  x <- goodVals[fillIdx]
 
  # If it was originally a factor, convert it back
  if (!is.null(lvls)) {
    x <- factor(x, levels=seq_along(lvls), labels=lvls)
  }
 
  x
}
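A quick check of fillNAgaps() on a simple vector (values chosen for illustration):

```r
x <- c(NA, 1, NA, NA, 2, NA)
fillNAgaps(x)                   # NA 1 1 1 2 2
fillNAgaps(x, firstBack = TRUE) #  1 1 1 1 2 2
```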

# Example: split a sorted numeric vector into `bins` roughly equal groups
colInput <- c(2,4,5,6,1,8,9,3,10,13,11,12)
bins <- 5
colInput <- sort(colInput)

colOutput <- c()
# (possibly fractional) starting position of each bin
n <- seq(1, length(colInput), by = length(colInput)/bins)

for (i in 1:length(n)){
  if(i <= n[i]){
    colOutput[ceiling(n[i]):ceiling(n[i]+1)] <-  paste("group " ,toString(i))
    if(is.na(colOutput[floor(n[i])])){colOutput[floor(n[i])]<-paste("group " ,toString(i))}
  }
}


colOutput<-colOutput[1:length(colInput)]

colOutput<-fillNAgaps(colOutput)

tblNewGroup <- cbind(colInput ,colOutput)