Tuesday, October 29, 2024

Sharepoint - Excel to S3 automation

Microsoft Graph is a technology platform that allows organizations to connect their business systems with Microsoft 365 SaaS data and services. Microsoft Graph includes all the options available in modern cloud integration platforms, including interactive APIs, data ingress and egress capabilities, and event-based integration.

The best choice for this application is to use Microsoft Graph RESTful HTTP APIs. The client app responds to user actions and can make requests and process the data at a speed that's controlled by the client environment.( )

Steps to let Azure application get access to Sharepoint folder

  1. Request 'Sites.Selected' from Entra ID Admin Portal
    1. To be requested by: - You
    2. To be consented by: - Entra ID Administrator.

  2. Grant your application, permissions on the site collection

  3. Use Python SDK within your application

import asyncio
import requests
from azure.identity import ClientSecretCredential
from msgraph import GraphServiceClient
import nest_asyncio

# pip install msgraph-sdk azure-identity nest_asyncio


nest_asyncio.apply()

client_secret = 'xxxxxxx' # graphapi
tenant_id = 'xxxxxxx'
client_id = 'xxxxxx'
site_id = 'xxxxxx'
drive_id = 'xxxxxx' # SharePoint document library is a drive in Graph
parent_folder_id = 'xxxxxx'
## INPUT FROM USER ##
folder_name = 'xxxxx'
file_name = 'xxxxxx.xlsx'

# Create a credential object. Used to authenticate requests
credential = ClientSecretCredential(
tenant_id=tenant_id,
client_id=client_id,
client_secret=client_secret)

scopes = ['https://graph.microsoft.com/.default']
# Create an API client with the credentials and scopes.
client = GraphServiceClient(credentials=credential, scopes=scopes)


# Get folder_id from MFG Data Owners folder
async def get_folder_id(folder_name):
item = await client.drives.by_drive_id(drive_id).items.by_drive_item_id(parent_folder_id).children.get()
if item and item.value:
for item in item.value:
if item.name == folder_name:
return item.id


# Get Excel file from sub folder
async def get_drive_items(folder_id):
item = await client.drives.by_drive_id(drive_id).items.by_drive_item_id(folder_id).children.get()
if item and item.value:
for item in item.value:
if item.name == file_name:
print(item.name)
print(item.additional_data)
updated_file_name = item.name.replace(" ", "_")
download_url = item.additional_data['@microsoft.graph.downloadUrl']
resp = requests.get(download_url)

output = open(f'{updated_file_name}', 'wb')
output.write(resp.content)
output.close()


folder_id = asyncio.run(get_folder_id(folder_name))
asyncio.run(get_drive_items(folder_id))