Updating LlamaIndex to version 0.10

With the release of LlamaIndex v0.10, imports have changed from the top-level llama_index package to llama_index.core, llama_index.embeddings, and llama_index.llms.

ServiceContext has also been deprecated and replaced with Settings. A concise version of the existing code is below.

from llama_index import ServiceContext
from llama_index.embeddings import AzureOpenAIEmbedding
from llama_index.evaluation import FaithfulnessEvaluator, RelevancyEvaluator
from llama_index.llms import AzureOpenAI

def evaluate_llama(dataset):
    llm = AzureOpenAI()
    embed_model = AzureOpenAIEmbedding()
    service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)

    faithfulness_gpt4 = FaithfulnessEvaluator(service_context=service_context)
    relevancy_gpt4 = RelevancyEvaluator(service_context=service_context)

    from llama_index.evaluation import BatchEvalRunner

The updated code replaces creating and passing ServiceContext around with the new Settings object, which also eliminates passing llm and embed_model around. This part is all straightforward, but the migration tool does not account for the new packages you need to add to requirements.txt.

pip install llama-index-core llama-index-embeddings-azure-openai llama-index-llms-azure-openai

Once you’ve installed the new packages, you should be able to update your imports. A concise version of the changes is listed below.

from llama_index.core import Settings
from llama_index.core.evaluation import FaithfulnessEvaluator, RelevancyEvaluator
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
from llama_index.llms.azure_openai import AzureOpenAI

def evaluate_llama(dataset):
    Settings.llm = AzureOpenAI()
    Settings.embed_model = AzureOpenAIEmbedding()

    faithfulness_gpt4 = FaithfulnessEvaluator()
    relevancy_gpt4 = RelevancyEvaluator()

    from llama_index.core.evaluation import BatchEvalRunner


Squash all commits on a git branch

To squash all git commits on a branch you can run

git reset $(git merge-base master $(git branch --show-current))

There are other required steps, such as ensuring your branch is up to date with the base branch, but the gist of what you need is the single command above.
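As a sketch, here is a self-contained demo you can run in a throwaway directory (the repo, branch name, and commit messages are all made up for illustration):

```shell
set -e

# Build a disposable repo with a couple of commits on a feature branch
demo=$(mktemp -d)
cd "$demo"
git init -q -b master
git config user.email demo@example.com
git config user.name demo
echo base > base.txt && git add . && git commit -qm "initial commit"

git checkout -qb feature
echo one > one.txt && git add . && git commit -qm "commit 1"
echo two > two.txt && git add . && git commit -qm "commit 2"
echo "commits ahead of master before squash: $(git rev-list --count master..feature)"

# Move the branch pointer back to where it forked from master; the file
# changes stay behind in the working tree, ready to re-commit as one commit
git reset -q "$(git merge-base master "$(git branch --show-current)")"
git add -A && git commit -qm "squashed commit"
echo "commits ahead of master after squash: $(git rev-list --count master..feature)"
```

Because the squash rewrites the branch history, pushing an existing remote branch afterwards requires a force push (git push --force-with-lease).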


Resolving glibc errors with python module

We recently switched our Lambda build image to a Debian-based image and started receiving errors around glibc.

[ERROR] Runtime.ImportModuleError: Unable to import module 'app':
/lib64/libc.so.6: version 'GLIBC_2.28' not found
(required by /var/task/cryptography/hazmat/bindings/_rust.abi3.so)

After some googling, we realized pip chooses the correct wheel for the machine it runs on. Since we were running pip on a different machine than the one running our Python program, we needed to tell pip which platform to target.

RHEL/CentOS use manylinux2014, which is what we need to pass to pip

--platform manylinux2014_x86_64

Additionally, we do not want to fall back to source packages, so we had to pass

--only-binary=:all:

Our final command ended up being

python3 -m pip install --platform manylinux2014_x86_64 --only-binary=:all: -r requirements.txt
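To sanity-check a mismatch like this, compare the glibc your runtime image actually provides against what the wheel wants (a diagnostic sketch; the 2.28 requirement comes from the error above, while manylinux2014 wheels assume only glibc 2.17 or newer):

```shell
# Print the glibc version this host provides; if it is older than the
# version named in the import error, the wheel was built against a newer
# libc than the runtime image ships
ldd --version | head -n 1
```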


Using Spring JPA with tables names with spaces, periods, and other special characters

Given a non-traditional table name, how do you get Spring JPA to recognize your @Entity properly?

If your table name has a period, such as odd.table, use @Table(name="[odd].[table]")

If your table name has a slash, such as odd/table, use @Table(name="[odd/table]")

If your table name has spaces, such as table with spaces, use @Table(name="[table with spaces]")

TL;DR - [] are your friend.


Converting a JSON file to a key and value list using jq

Given a JSON file named data.json

{
  "name": "Matt",
  "job": "Engineer"
}

You can output the keys and values using the following

jq -r 'to_entries|map("\(.key)=\(.value|tostring)")|.[]' data.json > file.txt

file.txt contains

name=Matt
job=Engineer

You can uppercase the key by piping .key to ascii_upcase

jq -r 'to_entries|map("\(.key|ascii_upcase)=\(.value|tostring)")|.[]' data.json > file.txt

file.txt now contains

NAME=Matt
JOB=Engineer

You can also prepend text to the keys; here we’ll prepend WOW_ to each key

jq -r 'to_entries|map("WOW_\(.key|ascii_upcase)=\(.value|tostring)")|.[]' data.json > file.txt

file.txt now contains

WOW_NAME=Matt
WOW_JOB=Engineer
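The same filter handles non-string values too, since tostring normalizes them before interpolation. A quick sketch with a numeric value (hypothetical data, same filter):

```shell
# tostring converts non-string values (numbers, booleans) to strings so the
# key=value lines come out uniform
echo '{"name":"Matt","years":10}' > data.json
jq -r 'to_entries|map("\(.key)=\(.value|tostring)")|.[]' data.json
```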


External app config with React, Vite, AWS CloudFront, S3, and Secrets Manager

Putting secrets in your git repo is a no-no. Learn how to keep them out using React, S3, and AWS Secrets Manager.

Create a secret (you can also do this manually through the UI)

rSecret:
    Type: AWS::SecretsManager::Secret
    Properties:
        Description: Secrets used to create application configuration
        Name: !Sub '${pProduct}'
        SecretString: '{}'
        Tags:
            - Key: Environment
              Value: !Ref pEnvironment

Outputs:
  oSecrets:
    Value: !Ref rSecret

Add your secrets using Secrets Manager in your AWS account

Adding secrets to AWS Secrets Manager

Create a CodePipeline step to deploy your application to S3

- Name: Build_and_Deploy_To_S3
  ActionTypeId:
    Category: Build
    Owner: AWS
    Provider: CodeBuild
    Version: '1'
  Configuration:
    ProjectName: !Sub ${pProduct}-${pBusinessUnit}-S3Upload-${AWS::Region}
    # The env variables are necessary to retrieve the secret id, you can omit if you'd like to hard code it
    EnvironmentVariables: !Sub '[{"name":"S3_BUCKETS_ARTIFACT_VAR", "value":"CODEBUILD_SRC_DIR_${pBusinessUnit}S3", "type":"PLAINTEXT"}, {"name":"S3_BUCKETS_ARTIFACT_FILE", "value":"${pBusinessUnit}S3Buckets.json", "type":"PLAINTEXT"}]'
    PrimarySource: Source
  InputArtifacts:
    - Name: Source

Update your deploy to S3 step to pull in the secrets

profile=your-profile

npm ci
viteFilename=.env.production # .env.production is picked up by vite build by default; your filename may vary depending on which vite build command you run

# Pull the secret you created earlier from secrets manager and output as json file
appConfigSecret=$(jq .oSecrets ${!S3_BUCKETS_ARTIFACT_VAR}/${S3_BUCKETS_ARTIFACT_FILE} -r)
aws secretsmanager get-secret-value --secret-id ${appConfigSecret} --query SecretString --profile ${profile} | jq -r > secrets.json

# use jq to update your secrets from json to VITE_SECRET=secret-value
jq -r 'to_entries|map("VITE_\(.key|ascii_upcase)=\(.value|tostring)")|.[]' secrets.json > ${viteFilename}

# Run your build. It is very important that the secrets file is on disk before the build runs, otherwise your application will not have access to the secrets
vite build

# copy your application files to s3
s3BucketPath=s3://your-bucket-path
aws s3 rm ${s3BucketPath} --recursive --profile ${profile} --quiet
aws s3 cp ./dist/ ${s3BucketPath} --recursive --sse AES256 --profile ${profile} --quiet

Finally, add a script to your package.json file to allow new developers to download the .env file from your S3 bucket to your local file system

"config": "npx path-exists-cli .env && echo 'exists' || aws s3 cp s3://insert-your-bucket-path/.env.production ./.env",

The beautiful thing about React plus Vite is that the VITE_-prefixed values are inlined into the bundle at build time, so the .env file itself is never deployed with your application.


Cache bust JavaScript, CSS or other file in Dockerfile

If you have an application without a build system, but need to cache bust a js file, this will do the trick

FROM nginx:1

COPY --chown=nginx:nginx html/ /usr/share/nginx/html

EXPOSE 8080

# Cache bust js file by appending date to scan.js file
RUN sed -i "s/scan.js/scan.js?a=$(date '+%Y%m%d%H%M')/g" /usr/share/nginx/html/index.html

ENTRYPOINT ["nginx", "-g", "daemon off;"]
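The same sed trick can be exercised outside Docker (a sketch; GNU sed's in-place -i flag is assumed, and this index.html is a stand-in file):

```shell
# Create a stand-in index.html referencing scan.js
printf '<script src="scan.js"></script>\n' > index.html

# Append a timestamp query string so browsers see a brand-new URL per build
sed -i "s/scan.js/scan.js?a=$(date '+%Y%m%d%H%M')/g" index.html
cat index.html
```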


Unzip Docker image and contents

If you ever need to see the files inside a docker image, you can save the image locally and then unzip all the contents.

image_tag=repository:tag

docker save ${image_tag} > image.tar
tar xf image.tar
rm image.tar

for f in */; do
  if [ -d "${f}" ]; then
    cd "${f}"
    # unzip each of the layers
    find ./ -type f -name "*.tar" -exec tar xf "{}" \;
    cd ../
  fi
done


AWS CloudFront create redirect using CloudFormation

When decommissioning a website, it’s ideal to set up a permanent redirect for the current domain, so users aren’t left in the dark. Below is code to redirect a user from an existing CloudFront distribution to a new URL.

You can use any statusCode, but in this instance a 301 is appropriate because this is a permanent redirect.

# The Distribution should already exist. We just need to add the FunctionAssociations
Resources:
  rDistribution:
    Type: AWS::CloudFront::Distribution # https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-cloudfront-distribution-distributionconfig.html
    Properties:
      DistributionConfig:
        Enabled: True
        DefaultRootObject: index.html
        DefaultCacheBehavior:
          TargetOriginId: BuscheOrigin
          ViewerProtocolPolicy: redirect-to-https
        ### new code ###
        FunctionAssociations:
          - EventType: viewer-request
            FunctionARN: !GetAtt BuscheRedirectFunction.FunctionMetadata.FunctionARN #name needs to match redirect function
        ### end new code ###
        HttpVersion: http2

### new function ###
  BuscheRedirectFunction: # https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-cloudfront-function.html
    Type: AWS::CloudFront::Function
    Properties:
      Name: "busche-redirect"
      AutoPublish: true
      FunctionCode: |
        function handler(event) {
          return {
            statusCode: 301,
            statusDescription: 'Moved Permanently',
            headers: {
              'cloudfront-functions': { value: 'generated-by-CloudFront-Functions' },
              'location': { value: 'https://mrbusche.com' }
            }
          };
        }
      FunctionConfig:
        Comment: rewrite requests from busche to mrbusche.com
        Runtime: cloudfront-js-1.0
### end new function ###
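CloudFront Functions run in a restricted JavaScript runtime, but logic this small can be smoke-tested anywhere Node is installed (a local sketch, not the deployed runtime):

```shell
# Evaluate the same handler logic locally and print the redirect it produces
node -e "
function handler(event) {
  return {
    statusCode: 301,
    headers: {
      location: { value: 'https://mrbusche.com' }
    }
  };
}
var res = handler({});
console.log(res.statusCode + ' ' + res.headers.location.value);
"
```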


Concourse push commit to pull request branch

We run prettier against our repository, but rather than fail a pull-request check if users don’t run it locally, we wanted to update the branch, push a commit, and run the other checks. Here’s what we were able to get working

resource_types:
  - name: pull-request
    type: registry-image
    source:
      repository: teliaoss/github-pr-resource

resources:
  - name: pull-request
    type: pull-request
    icon: source-pull
    check_every: 8760h
    webhook_token: token
    public: true
    source:
      repository: github.com/mrbusche/your-repository
      access_token: ((git-access-token))
      v3_endpoint: https://github.com/api/v3/
      v4_endpoint: https://github.com/api/graphql

jobs:
  - name: test-pull-request
    build_log_retention:
      builds: 30
      days: 30
      minimum_succeeded_builds: 10
    public: true
    plan:
    - get: pull-request
      trigger: true
      version: every
    - put: pull-request
      params:
        path: pull-request
        context: UI Tests
        status: pending
    - task: prettier
      config:
        platform: linux
        image_resource:
          type: docker-image
          source:
            repository: image-with-node-git-and-jq
        inputs:
        - name: pull-request
        # this makes pull-request available to subsequent tasks
        outputs:
          - name: pull-request
        params:
          USERNAME: ((git-username))
          ACCESS_TOKEN: ((git-access-token))
        run:
          path: sh
          dir: pull-request
          args:
          - -exc
          - |

            prNumber=$(cat .git/resource/metadata.json | jq -r '.[] | select(.name=="pr") | .value')
            branchName=$(curl 'https://'${USERNAME}:${ACCESS_TOKEN}'@github.com/api/v3/repos/mrbusche/your-repository/pulls/'${prNumber} | jq -r '.head.ref')

            git fetch
            git checkout ${branchName}

            # find last committer email and username, concourse-ci by default
            commitUserId=$(curl 'https://'${USERNAME}:${ACCESS_TOKEN}'@github.com/api/v3/repos/mrbusche/your-repository/pulls/'${prNumber} | jq -r '.user.login')
            git config user.email ${commitUserId}
            git config user.name "$(git log -1 --pretty=format:'%an')"

            # run some command that modifies files
            npm install -g prettier
            npm run prettier:fix

            git add .

            # if you try to commit and push without changes the task fails
            if [[ ! -z "$(git status --porcelain)" ]]; then
              git commit -m "prettier fix"

              git push --set-upstream origin ${branchName}
            else
              echo "no changes found"
            fi

    - task: test-ui
      config:
        platform: linux
        image_resource:
          type: registry-image
          source:
            repository: timbru31/node-chrome
            tag: 14-slim
        inputs:
          - name: pull-request
        run:
          path: /bin/sh
          args:
            - -exc
            - |
              npm config set cache $(pwd)/.npm --global
              cd pull-request
              export NG_CLI_ANALYTICS=false
              CYPRESS_INSTALL_BINARY=0 npm ci --quiet
              npm run test:headless
        caches:
          - path: .npm
          - path: pull-request/node_modules
      on_success:
        put: pull-request
        params:
          path: pull-request
          context: UI Tests
          status: success
      on_failure:
        put: pull-request
        params:
          path: pull-request
          context: UI Tests
          status: failure
    - put: pull-request
      params:
        path: pull-request
        status: success
