Testing & AI/LLMs

Over the last two-plus years, there has been an unprecedented boom in the application of artificial intelligence, and its use in the field of testing has grown rapidly as well.

However, I don’t see many people talking or writing about how to test AI or the underlying models (also called Large Language Models, or LLMs). On this page, I’ll try to list the articles, resources, videos, and blog posts that I can find on the subject. So if you’re interested in this field, read through the list below.

Trustworthy Retrieval-Augmented Generation with the Trustworthy Language Model
RAG vs Finetuning LLMs – What to use, when, and why.
Regression Testing with LangSmith
Regression Testing with LangSmith – Video
Comprehensive Guide to reducing LLM Hallucinations
Possibilities of use of AI in Testing
RAG for Quality Engineers
Mitigating Hallucinations in LLMs by Enhancing Pre-training Data Quality
AI model for Browser Automation
https://github.com/dipjyotimetia/jarvis
Open source RAG with Llama 2 and LangChain
Frameworks in Focus: ‘Building and Evaluating Advanced RAG’ with TruLens and LlamaIndex Insights
Weights and Biases in Machine Learning
Three phase LLM learning
Transformer Model
Supervised vs Unsupervised Learning
Machine Learning vs Deep Learning
PDF retrieval using LlamaParse
Foundational Models
What are LLMs
What is an LLM – Cloudflare
LLM – Techopedia
What are LLMs – AWS Amazon
Large Language Model Training in 2024 – AIMultiple
What is LLM – Elastic
Masked Language Modelling
Building Product Knowledge Graph in DoorDash
RAG vs Large Context Models
What is a RAG
Building and evaluating a RAG with TruLens
RAG with BigQuery and LangChain
Diagnostic Tool for Deep Neural Networks
Evaluating Large Language Models: A Complete Guide
Evidently AI – Evaluation of LLM
LLM Evaluation Guide
Best LLM
Essential Idea of a Neural Network
Advancements in RAG Technologies and Techniques
GraphRAG
LLM Observability with Grafana
Master RAG in 5 hours
Using Evaluations to Optimize a RAG Pipeline: from Chunkings and Embeddings to LLMs
What We Still Don’t Understand About Machine Learning
The Tech Buffet #15: Build and Evaluate LLM Applications with TruLens
Using Giskard – an e2e evaluation framework for LLMs
Introduction to Giskard: Open-Source Quality Management for AI Models
Giskard : The testing framework for ML models, from tabular to LLMs.

TESTEROPS

A pragmatic approach to QA and OPS
