Upstage

Upstage is a leading artificial intelligence (AI) company that specializes in LLM components with above-human-grade performance.

Solar LLM

Solar Mini Chat is a fast yet powerful large language model focused on English and Korean. It has been fine-tuned for multi-turn chat and, compared to other models of a similar size, shows enhanced performance across a wide range of natural language processing tasks, including multi-turn conversation and tasks that require understanding long contexts, such as RAG (Retrieval-Augmented Generation). This fine-tuning helps it handle longer conversations more effectively, which makes it well suited to interactive applications.

Besides Solar, Upstage also offers features for real-world RAG, such as Groundedness Check and Layout Analysis.

Installation and Setup

Install the langchain-upstage package:

pip install -qU langchain-core langchain-upstage

Get API keys and set the environment variables UPSTAGE_API_KEY and UPSTAGE_DOCUMENT_AI_API_KEY.

As of April 2024, you need separate API keys for Solar and Document AI (Layout Analysis). The keys are expected to be consolidated soon (hopefully in May), after which you will need just one key for all features.

Upstage LangChain integrations

API | Description | Import
Chat | Build assistants using Solar Mini Chat | from langchain_upstage import ChatUpstage
Text Embedding | Embed strings to vectors | from langchain_upstage import UpstageEmbeddings
Groundedness Check | Verify groundedness of assistant's response | from langchain_upstage import UpstageGroundednessCheck
Layout Analysis | Serialize documents with tables and figures | from langchain_upstage import UpstageLayoutAnalysisLoader

See the documentation for more details about the features.

Quick Examples

Environment Setup

import os

os.environ["UPSTAGE_API_KEY"] = "YOUR_API_KEY"
os.environ["UPSTAGE_DOCUMENT_AI_API_KEY"] = "YOUR_DOCUMENT_AI_API_KEY"
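
Hard-coding keys in source files is easy to leak by accident. As a minimal sketch, and assuming only the two environment variable names given above, the helpers below report which keys are missing and prompt for them interactively. The names `missing_keys` and `ensure_keys` are hypothetical and not part of langchain-upstage.

```python
import os
from getpass import getpass

# Environment variable names taken from the setup instructions above.
REQUIRED_KEYS = ("UPSTAGE_API_KEY", "UPSTAGE_DOCUMENT_AI_API_KEY")


def missing_keys(env=os.environ, names=REQUIRED_KEYS):
    """Return the names from `names` that are unset or empty in `env`."""
    return [name for name in names if not env.get(name)]


def ensure_keys(env=os.environ, names=REQUIRED_KEYS, prompt=getpass):
    """Prompt for each missing key and store the answer in `env`."""
    for name in missing_keys(env, names):
        env[name] = prompt(f"Enter {name}: ")
```

Calling `ensure_keys()` once before creating any Upstage object would keep the keys out of the script itself.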

Chat

from langchain_upstage import ChatUpstage

chat = ChatUpstage()
response = chat.invoke("Hello, how are you?")
print(response)

Text embedding

from langchain_upstage import UpstageEmbeddings

embeddings = UpstageEmbeddings()
doc_result = embeddings.embed_documents(
    ["Sam is a teacher.", "This is another document"]
)
print(doc_result)

query_result = embeddings.embed_query("What does Sam do?")
print(query_result)
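
Embeddings are typically compared rather than printed. The sketch below ranks document vectors against a query vector by cosine similarity, a common choice and not necessarily the one Upstage recommends. It uses plain Python lists of floats, and the toy vectors are stand-ins for illustration, not real Solar embeddings. `cosine_similarity` and `rank_documents` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_documents(query_vector, doc_vectors):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vector, v) for v in doc_vectors]
    return sorted(range(len(doc_vectors)), key=lambda i: scores[i], reverse=True)


# Stand-in vectors for illustration; real embeddings have many more dimensions.
toy_query = [1.0, 0.0]
toy_docs = [[0.0, 1.0], [0.9, 0.1]]
best_first = rank_documents(toy_query, toy_docs)
```

With real embeddings, the same call would take the query vector and the list of document vectors produced by the embedding example.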

Groundedness Check

from langchain_upstage import UpstageGroundednessCheck

groundedness_check = UpstageGroundednessCheck()

request_input = {
    "context": "Mauna Kea is an inactive volcano on the island of Hawaii. Its peak is 4,207.3 m above sea level, making it the highest point in Hawaii and second-highest peak of an island on Earth.",
    "answer": "Mauna Kea is 5,207.3 meters tall.",
}
response = groundedness_check.invoke(request_input)
print(response)
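
In the example above the answer (5,207.3 m) contradicts the context (4,207.3 m), so one would expect the check to flag it. A RAG pipeline usually needs to act on that verdict. The sketch below assumes the check returns one of the strings "grounded", "notGrounded", or "notSure"; that label set is an assumption to confirm against your installed version, and `accept_answer` is a hypothetical helper.

```python
def accept_answer(verdict, allow_unsure=False):
    """Decide whether to keep an answer given a groundedness verdict string.

    Assumes verdicts are "grounded", "notGrounded", or "notSure"; any other
    value is treated as not grounded.
    """
    if verdict == "grounded":
        return True
    if verdict == "notSure":
        return allow_unsure
    return False
```

Rejecting by default on an uncertain or unknown verdict is the conservative choice; pass `allow_unsure=True` only where an occasional unsupported answer is acceptable.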

Layout Analysis

from langchain_upstage import UpstageLayoutAnalysisLoader

file_path = "/PATH/TO/YOUR/FILE.pdf"
loader = UpstageLayoutAnalysisLoader(file_path, split="page")

# For improved memory efficiency, consider using the lazy_load method to load documents page by page.
docs = loader.load()  # or loader.lazy_load()

# Print the first three documents.
for doc in docs[:3]:
    print(doc)
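
Slicing with `[:3]` only works on a list, so it cannot be applied to the iterator that lazy_load() returns. As a minimal sketch, `itertools.islice` takes the first few items from any iterable without reading the rest. `fake_lazy_load` is a hypothetical stand-in for the loader so the snippet runs without an API key.

```python
from itertools import islice


def preview(documents, limit=3):
    """Return at most `limit` items from an iterable without consuming the rest."""
    return list(islice(documents, limit))


def fake_lazy_load(page_count):
    """Hypothetical stand-in for a loader's lazy_load(): yields one page at a time."""
    for page in range(1, page_count + 1):
        yield f"page {page}"


# Only the first three pages are ever produced by the generator.
first_pages = preview(fake_lazy_load(1000))
```

With the real loader, passing the iterator returned by lazy_load() to `preview` would fetch only the pages that are actually needed.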
