Wednesday, July 15, 2020

Python / S3 - Functions to list keys in an S3 bucket using Python


from glob import glob
import boto3


class Versions:
    def __init__(self):
        """Gets the latest version from local or s3"""        
         pass
    def get_latest_version_from_local(path):
        """Gets the latest version from Local"""        
        versions_paths = glob((path + "/*"), recursive=True)
        versions = []
        for i in enumerate(versions_paths):
            split_path = i[1].rstrip('/').split("/")
            version = split_path.pop()
            versions.append(version)
        versions.sort(reverse=True)
        return versions[0]
    def get_latest_version_from_s3(bucket_name, path):
        """Gets the latest version from s3 """       
        key = path.rstrip('/').split("/").pop()
        s3 = boto3.client('s3')
        response = s3.list_objects_v2(
            Bucket=bucket_name,            
            Prefix=key,            
            MaxKeys=100)
        versions = []
        for obj in response['Contents']:
            split_path = obj['Key'].rstrip('/').split("/")
            versions.append(split_path[1])
        versions.sort(reverse=True)
        return versions[0]
 
    def get_all_s3_keys(bucket_name):
        """Get a list of all keys in an S3 bucket."""        
        versions = []

        kwargs = {'Bucket': bucket_name}
        s3 = boto3.client('s3')
        while True:
            resp = s3.list_objects_v2(**kwargs)
            for obj in resp['Contents']:
                if 'well_production' in str(obj['Key']) 
                    and '$folder$' not in str(obj['Key']):
                    versions.append(obj['Key'])
            try:
                kwargs['ContinuationToken'] = resp['NextContinuationToken']
            except KeyError:
                break        
        versions.sort(reverse=True)
        return versions[0]

Thursday, July 9, 2020

SQL - table-valued function to get all months between two date range


CREATE FUNCTION [dbo].[GetMonths](@StartDate DATETIME, @EndDate DATETIME)

RETURNS @MonthList TABLE(MonthValue VARCHAR(15) NOT NULL)

AS

BEGIN

    --Variable used to hold each new date value

    DECLARE @DateValue DATETIME

    --Start with the starting date in the range

    SET @DateValue=@StartDate

    --Load output table with the month part of each new date

    WHILE @DateValue <= @EndDate

    BEGIN

        INSERT INTO @MonthList(MonthValue)

        SELECT cast(@DateValue as date)

        --Move to the next month

        SET @DateValue=DATEADD(mm,1,@DateValue)

    END

    RETURN 

END

Monday, July 6, 2020

SQL - Track where a Stored Procedure is being used


 SELECT o.name
 FROM syscomments AS c
 INNER JOIN sysobjects AS o
 ON c.id = o.id
 WHERE c.text LIKE '%stored_procedure_name%';

Wednesday, July 1, 2020

Regular Expression - good practice and a good place to check


A good place to check if you regular expression does the work is https://regex101.com/

e.g. 




I have a list of company list extracted from some website's filter:

By using regular expression, I can easily extract the company number in two lines:


regex = re.compile(r'.*\((\d{5})\)')
company_value_list = [regex.match(item.text).group(1) for item in company_list
                      if re.match(regex, item.text) is not None]


So the result is a list of company numbers:

['39227', '65860', '39639', '68942', '68979', '68998', '68938', '62950'.....]