# Llmware

> description: community resources, getting help and sharing ideas

---
layout: default
title: Community
nav_order: 6
has_children: true
description: community resources, getting help and sharing ideas
permalink: /community
---

# Community

Welcome to the llmware community!

We are on a mission to pioneer the use of small language models as transformational tools in the enterprise to automate workflows and knowledge-based processes cost-effectively, securely and with high quality.

We believe the secret is getting out that small models can be extremely effective, but they require a lot of attention to detail in building scalable data pipelines and fine-tuning both models and end-to-end workflows.

We are open to everyone, from the most advanced machine learning researchers to beginning developers just learning Python. We publish a wide range of examples, use cases and tutorial videos, and are always looking for feedback, new ideas and contributors.

{: .note}
> Contributions to `llmware` are governed by our [Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md).

{: .warning}
> Have you found a security issue? Then please jump to [Security Vulnerabilities](#security-vulnerabilities).

On this page, we provide information about contributing to ``llmware``. There are **two ways** you can contribute. The first is by making **code contributions**, and the second is by making contributions to the **documentation**. Please look at our [contribution suggestions](#how-can-you-contribute) if you need inspiration, or take a look at [open issues](#open-issues).

Contributions to `llmware` are welcome from everyone. Our goal is to make the process simple, transparent, and straightforward. We are happy to receive suggestions on how the process can be improved.

## How can you contribute?

{: .note}
> If you have never contributed before, look for issues with the tag [``good first issue``](https://github.com/llmware-ai/llmware/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22).

The most common ways to contribute are to add new features, fix bugs, add tests, or add documentation. You can visit the [issues](https://github.com/llmware-ai/llmware/issues) page of the project and search for tags such as ``bug``, ``enhancement``, ``documentation``, or ``test``.

Here is a non-exhaustive list of contributions you can make.

1. Code refactoring
2. Add new text databases
3. Add new vector databases
4. Fix bugs
5. Add usage examples (see for example the issues [jupyter notebook - more examples and better support](https://github.com/llmware-ai/llmware/issues/508) and [google colab examples and start up scripts](https://github.com/llmware-ai/llmware/issues/507))
6. Add experimental features
7. Improve code quality
8. Improve documentation in the docs (what you are reading right now)
9. Improve documentation by adding or updating docstrings in modules, classes, methods, or functions (see for example [Add docstrings](https://github.com/llmware-ai/llmware/issues/219))
10. Improve test coverage
11. Answer questions in our [Discord channel](https://discord.gg/MhZn5Nc39h), especially in the [technical support forum](https://discord.com/channels/1179245642770559067/1218498778915672194)
12. Post projects in which you use ``llmware`` in our Discord forum [made with llmware](https://discord.com/channels/1179245642770559067/1218567269471486012), ideally with a link to a public GitHub repository

## Open Issues

If you're interested in existing issues, you can

- Look for issues - if you are new to the project, start with those labeled `good first issue`.
- Provide answers for questions in our [GitHub discussions](https://github.com/llmware-ai/llmware/discussions)
- Provide help for bug or enhancement issues. Ask questions, reproduce the issues, or provide solutions.
- Open a pull request to fix an issue.

## Security Vulnerabilities

**If you believe you've found a security vulnerability, then please _do not_ submit an issue ticket or pull request or otherwise publicly disclose the issue.**

Please follow the process at [Reporting a Vulnerability](https://github.com/llmware-ai/llmware/blob/main/Security.md)

## GitHub workflow

We follow the [``fork-and-pull``](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork) Git workflow.

1. [Fork](https://docs.github.com/en/github/getting-started-with-github/fork-a-repo) the repository on GitHub.
2. Clone your fork to your local machine with `git clone git@github.com:<your-username>/llmware.git`.
3. Create a branch with `git checkout -b my-topic-branch`.
4. Run the test suite by navigating to the tests/ folder and running ```./run-tests.py -s``` to ensure there are no failures.
5. [Commit](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/committing-changes-to-a-pull-request-branch-created-from-a-fork) changes to your own branch, then push to GitHub with `git push origin my-topic-branch`.
6. Submit a [pull request](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests) so that we can review your changes.

Remember to [synchronize your forked repository](https://docs.github.com/en/github/getting-started-with-github/fork-a-repo#keep-your-fork-synced) _before_ submitting proposed changes upstream. If you have an existing local repository, please update it before you start, to minimize the chance of merge conflicts.

```shell
git remote add upstream git@github.com:llmware-ai/llmware.git
git fetch upstream
git checkout upstream/main -b my-topic-branch
```

## Community

Questions and discussions are welcome in any shape or form. Please feel free to join our community on our Discord channel, on which we are active daily. You are also welcome if you just want to post an idea!

- [Discord Channel](https://discord.gg/MhZn5Nc39h)
- [GitHub discussions](https://github.com/llmware-ai/llmware/discussions)

---

---
layout: default
title: FAQ
parent: Community
nav_order: 1
description: overview of the major modules and classes of LLMWare
permalink: /community/faq
---

# Frequently Asked Questions (FAQ)

### How can I set the chunk size?

#### "I want to parse my documents into smaller chunks"

You can set the chunk size with the ``chunk_size`` parameter of the ``add_files`` method.

The ``add_files`` method from the ``Library`` class has a ``chunk_size`` parameter that controls the chunk size. The method also has a ``max_chunk_size`` parameter that controls the maximum chunk size. These two parameters are passed on to the ``Parser`` class.

In the following example, we add the same files with different chunk sizes to the library ``chunk_size_example``.
```python
from pathlib import Path

from llmware.library import Library

path_to_my_library_files = Path('~/llmware_data/sample_files/Agreements').expanduser()

my_library = Library().create_new_library(library_name='chunk_size_example')

my_library.add_files(input_folder_path=path_to_my_library_files, chunk_size=400)
my_library.add_files(input_folder_path=path_to_my_library_files, chunk_size=600)
```

### How can I set the embedding store?

#### "I want to use a specific embedding store"

You can set the embedding store with the ``vector_db`` parameter of the ``install_new_embedding`` method, which you call on a ``Library`` object each time you want to create an embedding for a *library*.

The ``install_new_embedding`` method from the ``Library`` class has a ``vector_db`` parameter that sets the embedding store. At the time of this writing, *LLMWare* supports the embedding stores [chromadb](https://github.com/chroma-core/chroma), [neo4j](https://github.com/neo4j/neo4j), [milvus](https://github.com/milvus-io/milvus), [pg_vector](https://github.com/pgvector/pgvector), [postgres](https://github.com/postgres/postgres), [redis](https://github.com/redis/redis), [pinecone](https://www.pinecone.io/), [faiss](https://github.com/facebookresearch/faiss), [qdrant](https://github.com/qdrant/qdrant), [mongo atlas](https://www.mongodb.com/products/platform/atlas-database), and [lancedb](https://github.com/lancedb/lancedb).

In the following example, we create the same embeddings three times for the same library, but store them in three different embedding stores.

```python
import logging
from pathlib import Path

from llmware.configs import LLMWareConfig
from llmware.library import Library

logging.info(f'Currently supported embedding stores: {LLMWareConfig().get_supported_vector_db()}')

library = Library().create_new_library(library_name='embedding_store_example')
library.add_files(input_folder_path=Path('~/llmware_data/sample_files/Agreements').expanduser())

library.install_new_embedding(vector_db="pg_vector")
library.install_new_embedding(vector_db="milvus")
library.install_new_embedding(vector_db="faiss")
```

### How can I set the collection store?

#### "I want to use a specific collection store"

You can set the collection store with the ``set_active_db`` method of the ``LLMWareConfig`` class.

The collection store is set using the ``LLMWareConfig`` class with the ``set_active_db`` method. At the time of writing, **LLMWare** supports the three collection stores *MongoDB*, *Postgres*, and *SQLite* - which is the default. You can retrieve the supported collection stores with the ``get_supported_collection_db`` method.

In the example below, we first log the currently active collection store, then we retrieve the supported collection stores, before we switch to *Postgres*.

```python
import logging

from llmware.configs import LLMWareConfig

logging.info(f'Currently active collection store: {LLMWareConfig.get_active_db()}')
logging.info(f'Currently supported collection stores: {LLMWareConfig().get_supported_collection_db()}')

LLMWareConfig.set_active_db("postgres")

logging.info(f'Currently active collection store: {LLMWareConfig.get_active_db()}')
```

### How can I retrieve more context?

#### "I want to retrieve more context from a query"

One way to retrieve more context is to set the ``result_count`` parameter of the ``query``, ``text_query``, and ``semantic_query`` methods from the ``Query`` class. By increasing ``result_count``, the number of retrieved results is increased, which increases the context size.
The ``Query`` class has the ``query``, ``text_query``, and ``semantic_query`` methods, which allow you to set the number of retrieved results with ``result_count``. On a side note, ``query`` is a wrapper function for ``text_query`` and ``semantic_query``. The value of ``result_count`` is passed on to the queried embedding store to control the number of retrieved results. For example, for *pgvector*, ``result_count`` is passed on as the value after the ``LIMIT`` keyword. In the ``SQL`` example below, you can see the resulting ``SQL`` query generated by ``LLMWare`` if ``result_count=10``, the name of the collection is ``agreements``, and the query vector is ``[1, 2, 3]``.

```sql
SELECT id, block_mongo_id, embedding <-> '[1, 2, 3]' AS distance, text
FROM agreements
ORDER BY distance
LIMIT 10;
```

In the following example, we execute the same query against a library twice but change the number of retrieved results from ``3`` to ``6``.

```python
import logging
from pathlib import Path

from llmware.configs import LLMWareConfig
from llmware.library import Library
from llmware.retrieval import Query

logging.info(f'Currently supported embedding stores: {LLMWareConfig().get_supported_vector_db()}')

library = Library().create_new_library(library_name='context_size_example')
library.add_files(input_folder_path=Path('~/llmware_data/sample_files/Agreements').expanduser())
library.install_new_embedding(vector_db="pg_vector")

query = Query(library)

query_results = query.semantic_query(query='salary', result_count=3, results_only=True)
logging.info(f'Number of results: {len(query_results)}')

query_results = query.semantic_query(query='salary', result_count=6, results_only=True)
logging.info(f'Number of results: {len(query_results)}')
```

### How can I set the Large Language Model?

#### "I want to use a different LLM"

You can set the Large Language Model (LLM) with the ``gen_model`` parameter of the ``load_model`` method from the ``Prompt`` class.

The ``Prompt`` class has the method ``load_model`` with the ``gen_model`` parameter, which sets the LLM. The ``gen_model`` parameter is passed on to the ``ModelCatalog`` class, which loads the LLM either from HuggingFace or from another source. The ``ModelCatalog`` allows you to **list all available models** with the method ``list_generative_models``, or just the local models with ``list_generative_local_models``, or just the open source models with ``list_open_source_models``.

In the example below, we log all available LLMs, including the ones that are available locally and the open source ones, and also create the prompters. Each prompter uses a different LLM from our [BLING model series](https://llmware.ai/about), which you can also find on [HuggingFace](https://huggingface.co/collections/llmware/bling-models-6553c718f51185088be4c91a).

```python
import logging

from llmware.models import ModelCatalog
from llmware.prompts import Prompt

llm_gen = ModelCatalog().list_generative_models()
logging.info(f'List of all LLMs: {llm_gen}')

llm_gen_local = ModelCatalog().list_generative_local_models()
logging.info(f'List of all local LLMs: {llm_gen_local}')

llm_gen_open_source = ModelCatalog().list_open_source_models()
logging.info(f'List of all open source LLMs: {llm_gen_open_source}')

prompter_bling_1b = Prompt().load_model(gen_model='llmware/bling-1b-0.1')
prompter_bling_tiny_llama = Prompt().load_model(gen_model='llmware/bling-tiny-llama-v0')
prompter_bling_falcon_1b = Prompt().load_model(gen_model='llmware/bling-falcon-1b-0.1')
```
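Once a prompter has been created, you can run an inference with it. Below is a minimal usage sketch, assuming ``prompt_main`` as the basic inference entry point with an optional ``context`` string; the question and context text are illustrative only.

```python
from llmware.prompts import Prompt

# load one of the BLING models into a prompter (same pattern as the example above)
prompter = Prompt().load_model(gen_model='llmware/bling-tiny-llama-v0')

# assumption for illustration: prompt_main runs a basic inference and accepts
# an optional context passage that grounds the answer
context = "The base salary of the executive shall be $350,000 per year."
response = prompter.prompt_main("What is the annual base salary?", context=context)

print("response: ", response)
```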
#### "I want to use a different embedding model" You can set the embedding model with the ``embedding_model_name`` parameter of the ``install_new_embedding`` method from the ``Library`` class. The ``Library`` class has the method ``install_new_embedding`` with the ``embedding_model_name`` parameter which sets the embedding model. The ``ModelCatalog`` allows you to **list all available embedding models** with the ``list_embedding_models`` method. In the following example, we list all available embedding models, and then we create a library with the name ``embedding_models_example``, which we embed two times with embedding models ``'mini-lm-sber'`` and ``'industry-bert-contracts'``. ```python import logging from llmware.models import ModelCatalog from llmware.library import Library embedding_models = ModelCatalog().list_generative_models() logging.info(f'List of embedding models: {embedding_models}') library = Library().create_new_library(library_name='embedding_models_example') library.add_files(input_foler_path=Path('~/llmware_data/sample_files/Agreements')) library.install_new_embedding(embedding_model_name='mini-lm-sber') library.install_new_embedding(embedding_model_name='industry-bert-contracts') ``` ### Why is the model running slowly in Google Colab? #### "I want to improve the performance of my model on Google Colab" Our models are designed to run on at least 16GB of RAM. By default Google Colab provides ~13GB of RAM, which significantly slows computational speed. To ensure the best performance when using our models, we highly recommend enabling the T4 GPU in Colab. This will provide the notebook with additional resources, including 16GB of RAM, allowing our models to run smoothly and efficiently. Steps to enabling T4 GPU in Colab: 1. In your Colab notebook, click on the "Runtime" tab 2. Select "Change runtime type" 3. Under "Hardware Accelerator", select T4 GPU NOTE: There is a weekly usage limit on using T4 for free. --- --- layout: default title: Join Our Community parent: Community nav_order: 4 description: overview of the major modules and classes of LLMWare permalink: /community/join_our_community --- # Join the LLMWare Community ___ # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. 
## License

`llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!

---

---
layout: default
title: Need Help
parent: Community
nav_order: 3
description: overview of the major modules and classes of LLMWare
permalink: /community/need_help
---

# Need Help

___

# More information about the project

- [see main repository](https://www.github.com/llmware-ai/llmware.git)

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).

## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!

---

---
layout: default
title: Troubleshooting
parent: Community
nav_order: 2
description: overview of the major modules and classes of LLMWare
permalink: /community/troubleshooting
---

# Common Troubleshooting Issues

___

1. **Cannot install the pip package**

    -- Check your Python version. If using Python 3.9-3.11, then almost any version of llmware should work. If using an older Python (before 3.9), then it is likely that dependencies will fail in the pip process. If you are using Python 3.12, then you need to use llmware>=0.2.12.

    -- Dependency constraint error. If you receive a specific error around a dependency version constraint, then please raise an issue and include details about your OS, Python version, any unique elements in your virtual environment, and the specific error.

2. **Parser module not found**

    -- Check your OS and confirm that you are using a [supported platform](platforms.md/#platform-support).

    -- If you cloned the repository, please confirm that the /lib folder has been copied into your local path.

3. **Pytorch model not loading**

    -- Confirm the obvious stuff - correct model name, model exists in the Huggingface repository, connected to the Internet with open ports for HTTPS connection, etc.

    -- Check your Pytorch version - update Pytorch to >2.0, which is required for many recent models released in the last 6 months, and which, in some cases, may require other dependencies not included in the llmware package.

    -- Note: we have seen some compatibility issues with Pytorch==2.3 on Wintel platforms - if you run into these issues, we recommend using a back-level Pytorch==2.1, which we have seen fix the issue.
4. **GGUF model not loading**

    -- Confirm that you are using llmware>=0.2.11 for the latest GGUF support.

    -- Confirm that you are using a [supported platform](platforms.md/#platform-support). We provide pre-built binaries for llama.cpp as a back-end GGUF engine on the following platforms:

    - Mac M1/M2/M3 - OS version 14 - "with accelerate framework"
    - Mac M1/M2/M3 - OS older versions - "without accelerate framework"
    - Windows - x86
    - Windows with CUDA
    - Linux - x86 (Ubuntu 20+)
    - Linux with CUDA (Ubuntu 20+)

    If you are using a different OS platform, you have the option to "bring your own llama.cpp" lib as follows:

    ```python
    from llmware.gguf_configs import GGUFConfigs
    GGUFConfigs().set_config("custom_lib_path", "/path/to/your/libllama_binary")
    ```

    If you have any trouble, feel free to raise an Issue and we can provide you with instructions and/or help compiling llama.cpp for your platform.

    -- Specific GGUF model - if you are successfully using other GGUF models, and only having problems with a specific model, then please raise an Issue and share the specific model and architecture.

5. **Example not working as expected** - please raise an issue, so we can evaluate and fix any bugs in the example code. Also, pull requests are always especially welcome with a fix or improvement in an example.

6. **Model not leveraging CUDA available in the environment**

    -- **Check that the CUDA drivers are installed correctly** - an easy check of the NVIDIA CUDA drivers is to use `nvidia-smi` and `nvcc --version` from the command line. Both commands should respond positively with details on the versions and implementations. Any error indicates that either the driver or the CUDA toolkit is not installed or not recognized. It can be complicated at times to debug the environment, usually with some trial and error. See the extensive [Nvidia Developer documentation](https://docs.nvidia.com) for troubleshooting steps specific to your environment.

    -- **Check that the CUDA drivers are up to date** - we build to CUDA 12.1, which translates to a minimum driver version of 525.60 on Linux, and 528.33 on Windows.

    -- **Pytorch model** - check that Pytorch is finding CUDA, e.g., `torch.cuda.is_available()` == True. We have seen issues on Windows, in particular, so confirm that your Pytorch version has been compiled with CUDA support. For Windows, in particular, we have found that you may need to install a CUDA-specific build of Pytorch, using the following command: ```pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121```

    -- **GGUF model** - logs will be displayed on the screen confirming whether CUDA is being used, or whether there is a 'fall-back' to CPU drivers. We run a custom CUDA install check, which you can run on your system with: ```gpu_status = ModelCatalog().gpu_available```

    If you have confirmed that CUDA is present, but fall-back to CPU is being used, you can set the GGUFConfigs to force CUDA: ```GGUFConfigs().set_config("force_gpu", True)```

    If you are looking to use specific optimizations, you can bring your own llama.cpp lib as follows: ```GGUFConfigs().set_config("custom_lib_path", "/path/to/your/custom/llama_cpp_backend")```

    -- If you cannot debug after these steps, then please raise an Issue. We are happy to dig in and work with you to run FAST local inference.

7. **Model result inconsistent**

    -- When loading the model, set `temperature=0.0` and `sample=False` -> this will give a deterministic output for better testing and debugging.
    -- Usually the issue will be related to the retrieval step and the formation of the Prompt - and, as always, good pipelines and a little experimentation usually help!

8. **Newly added examples not working as intended**

    -- If you run a recently added example and it does not run as intended, it is possible that the feature being used in the example has not yet been added to the latest pip install.

    -- To fix this, move the example file to the outermost directory of the repository, so that the example file you are trying to run is in the same directory as the `llmware` source code directory.

    -- This will let you run the example using the latest source code!

9. **Git permission denied error**

    -- If you are using SSH to clone the repository and you get an error that looks similar to `git@github.com: Permission denied (publickey)`, then you might not have configured your SSH key correctly.

    -- If you don't already have one, you will need to create a new SSH key on your local machine. For instructions on how to do this, check out this page: https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent.

    -- You then need to add the SSH key to your GitHub account. For instructions on how to do this, check out this page: https://docs.github.com/en/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account.

# More information about the project

- [see main repository](https://www.github.com/llmware-ai/llmware.git)

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).

## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!

---

---
layout: default
title: Agent Inference Server
parent: Components
nav_order: 12
description: overview of the major modules and classes of LLMWare
permalink: /components/agent_inference_server
---

# Agent Inference Server

---

LLMWare supports multiple deployment options, including the use of REST APIs to implement most model invocations. To set up an inference server for Agent processes:

```python
""" This example shows how to set up an inference server that can be used in conjunction with agent-based workflows.

    This script covers both the server-side deployment, as well as the steps taken on the client-side to deploy
    in an Agent example.

    Note: this example will build off two other examples:
    1. "examples/Models/launch_llmware_inference_server.py"
    2. "examples/SLIM-Agents/agent-llmfx-getting-started.py"
"""

from llmware.models import ModelCatalog, LLMWareInferenceServer

#   *** SERVER SIDE SCRIPT ***

base_model = "llmware/bling-tiny-llama-v0"

LLMWareInferenceServer(base_model,
                       model_catalog=ModelCatalog(),
                       secret_api_key="demo-test",
                       home_path="/home/ubuntu/",
                       verbose=True).start()

# this will start Flask-based server, which will display the launched IP address and port, e.g.,
# "Running on "

ip_address = "http://127.0.0.1:8080"

#   *** CLIENT SIDE AGENT PROCESS ***

from llmware.agents import LLMfx


def create_multistep_report_over_api_endpoint():

    """ This is derived from the script in the example agent-llmfx-getting-started.py. """

    customer_transcript = "My name is Michael Jones, and I am a long-time customer. " \
                          "The Mixco product is not working currently, and it is having a negative impact " \
                          "on my business, as we can not deliver our products while it is down. " \
                          "This is the fourth time that I have called. My account number is 93203, and " \
                          "my user name is mjones. Our company is based in Tampa, Florida."

    # create an agent using LLMfx class
    agent = LLMfx()

    # copy the ip address from the Flask launch readout
    ip_address = "http://127.0.0.1:8080"

    # inserting this line below into the agent process sets the 'api endpoint' execution to "ON"
    # all agent function calls will be deployed over the API endpoint on the remote inference server
    # to "switch back" to local execution, comment out this line
    agent.register_api_endpoint(api_endpoint=ip_address,
                                api_key="demo-test",
                                endpoint_on=True)

    # to explicitly turn the api endpoint "on" or "off"
    # agent.switch_endpoint_on()
    # agent.switch_endpoint_off()

    agent.load_work(customer_transcript)

    # load tools individually
    agent.load_tool("sentiment")
    agent.load_tool("ner")

    # load multiple tools
    agent.load_tool_list(["emotions", "topics", "intent", "tags", "ratings", "answer"])

    # start deploying tools and running various analytics
    # first conduct three 'soft skills' initial assessment using 3 different models
    agent.sentiment()
    agent.emotions()
    agent.intent()

    # alternative way to execute a tool, passing the tool name as a string
    agent.exec_function_call("ratings")

    # call multiple tools concurrently
    agent.exec_multitool_function_call(["ner", "topics", "tags"])

    # the 'answer' tool is a quantized question-answering model - ask an 'inline' question
    # the optional 'key' assigns the output to a dictionary key for easy consolidation
    agent.answer("What is a short summary?", key="summary")

    # prompting tool to ask a quick question as part of the analytics
    response = agent.answer("What is the customer's account number and user name?", key="customer_info")

    # you can 'unload_tool' to release it from memory
    agent.unload_tool("ner")
    agent.unload_tool("topics")

    # at end of processing, show the report that was automatically aggregated by key
    report = agent.show_report()

    # displays a summary of the activity in the process
    activity_summary = agent.activity_summary()

    # list of the responses gathered
    for i, entries in enumerate(agent.response_list):
        print("update: response analysis: ", i, entries)

    output = {"report": report, "activity_summary": activity_summary, "journal": agent.journal}

    return output
```

Need help or have questions?
============================

Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).

Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).

## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!

---

---
layout: default
title: Agents
parent: Components
nav_order: 4
description: overview of the major modules and classes of LLMWare
permalink: /components/agents
---

# Agents

---

Agents with Function Calls and SLIM Models 🔥

llmware has been designed to enable Agent and LLM-based function calls using small language models designed for local and private deployment, with the ability to leverage open source models to conduct complex RAG and knowledge-based workflow automation.

The key elements in llmware:

- **SLIM models** - 18 function-calling small language models, each optimized for a specific extraction, classification, generation, or summarization activity, which generate python dictionaries and lists as output.
- **LLMfx class** - enables a wide range of agent-based processes.

Here is an example to get started:

```python
from llmware.agents import LLMfx

text = ("Tesla stock fell 8% in premarket trading after reporting fourth-quarter revenue and profit that "
        "missed analysts’ estimates. The electric vehicle company also warned that vehicle volume growth in "
        "2024 'may be notably lower' than last year’s growth rate. Automotive revenue, meanwhile, increased "
        "just 1% from a year earlier, partly because the EVs were selling for less than they had in the past. "
        "Tesla implemented steep price cuts in the second half of the year around the world. In a Wednesday "
        "presentation, the company warned investors that it’s 'currently between two major growth waves.'")

# create an agent using LLMfx class
agent = LLMfx()

# load text to process
agent.load_work(text)

# load 'models' as 'tools' to be used in analysis process
agent.load_tool("sentiment")
agent.load_tool("extract")
agent.load_tool("topics")
agent.load_tool("boolean")

# run function calls using different tools
agent.sentiment()
agent.topics()
agent.extract(params=["company"])
agent.extract(params=["automotive revenue growth"])
agent.xsum()
agent.boolean(params=["is 2024 growth expected to be strong? (explain)"])

# at end of processing, show the report that was automatically aggregated by key
report = agent.show_report()

# displays a summary of the activity in the process
activity_summary = agent.activity_summary()

# list of the responses gathered
for i, entries in enumerate(agent.response_list):
    print("update: response analysis: ", i, entries)

output = {"report": report, "activity_summary": activity_summary, "journal": agent.journal}
```

Need help or have questions?
============================

Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).

Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).

## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!

---

---
layout: default
title: Components
nav_order: 3
has_children: true
description: llmware key architectural components, modules and classes
permalink: /components
---

# LLMWare Architecture

---

llmware is characterized by a logically integrated set of data pipelines involved in building LLM-based workflows, centered on two main sub-pipelines with high-level interfaces intended to provide an abstraction layer over individual 'end point' components, to promote code re-use and the ability to easily 'swap' different components with minimal, if any, code change:
**1. Knowledge Ingestion** - "creating Gen AI food" - ingesting and organizing unstructured information from a wide range of data sources, including each of the major steps:

- Extracting and Parsing
- Text Chunking
- Indexing, Organizing and Storing
- Embedding
- Retrieval
- Analytics and Reuse of Content
- Combining with SQL Table and Other Structured Content

**Core LLMWare classes**: **Library**, **Query** (retrieval module), **Parser**, **EmbeddingHandler** (embeddings module), **Graph**, **CustomTables** (resources module) and **Datasets** (dataset_tools module).

In many cases, it is easy to get things done in LLMWare using only **Library** and **Query** - which provide convenient interfaces into parsing and embedding such that most use cases will not require calling those classes directly.

Supported document file types: pdf, pptx, docx, xlsx, txt, csv, html, jsonl, json, tsv, jpg, jpeg, png, wav, zip, md, mp3, mp4, m4a

Key methods to know (a minimal end-to-end sketch combining these calls follows the examples list below):

- Ingest anything - `Library().add_files(input_folder_path="path/to/docs")`
- Embed library - `Library().install_new_embedding(embedding_model_name="your embedding model", vector_db="your vector db")`
- Run Query - `Query(library).query(query, query_type="semantic", result_count=20)`

Top examples to get started:

- [Parsing examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing) - ~14 stand-alone parsing examples for all common document types, including options for parsing in memory, outputting to JSON, parsing custom configured CSV and JSON files, running OCR on embedded images found in documents, table extraction, image extraction, text chunking, zip files, and web sources.
- [Embedding examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Embedding) - ~15 stand-alone embedding examples to show how to use ~10 different vector databases and a wide range of leading open source embedding models (including sentence transformers).
- [Retrieval examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Retrieval) - ~10 stand-alone examples illustrating different query and retrieval techniques - semantic queries, text queries, document filters, page filters, 'hybrid' queries, author search, using query state, and generating bibliographies.
- [Dataset examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Datasets) - ~5 stand-alone examples to show 'next steps' of how to leverage a Library to re-package content into various datasets and automated NLP analytics.
- [Fast start example #1-Parsing](https://www.github.com/llmware-ai/llmware/tree/main/fast_start/example-1-create_first_library.py) - shows the basics of parsing.
- [Fast start example #2-Embedding](https://www.github.com/llmware-ai/llmware/tree/main/fast_start/example-2-build_embeddings.py) - shows the basics of building embeddings.
- [CustomTable examples](https://www.github.com/llmware-ai/llmware/tree/main/Structured_Tables) - ~5 examples to start building structured tables that can be used in conjunction with LLM-based workflows.
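As referenced above, here is a minimal sketch that strings the key ingestion methods together end to end; the folder path, embedding model and vector database names are illustrative placeholders - substitute your own.

```python
from llmware.library import Library
from llmware.retrieval import Query

# create a library, then parse, text chunk and index a folder of documents (path is a placeholder)
library = Library().create_new_library("architecture_quickstart")
library.add_files(input_folder_path="/path/to/docs")

# embed the library - model and vector db are illustrative choices from the supported options
library.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="chromadb")

# run a semantic query against the library
results = Query(library).query("termination provisions", query_type="semantic", result_count=20)

for r in results:
    print("result: ", r)
```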
**2. Model Prompting** - "Fun with LLMs" - the lifecycle of discovering, instantiating, and configuring an LLM-based model to execute an inference, including the ability to seamlessly prepare and integrate knowledge retrieval, as well as post-processing steps to validate accuracy, including:

- ModelCatalog - discover, load and manage configuration
- Inference
- Function Calls
- Prompts
- Prompt with Sources
- Fact Checking methods
- Agent-based multi-step processes
- Prompt History

Core LLMWare classes: **ModelCatalog** (models module), **Prompt**, **LLMfx** (agents module).

Key methods to know (a short inference sketch using these methods appears at the end of this section):

- Discover Models - `ModelCatalog().list_all_models()`
- Load Model - `model = ModelCatalog().load_model(model_name)`
- Inference - `response = model.inference(prompt, add_context=context)`
- Prompt - wraps the model class to provide easy source/retrieval management
- LLMfx - wraps the model class for function-calling SLIM models for agent processes

While ~17 individual model classes are exposed in the models module, for most use cases, we recommend working through the higher-level interface of ModelCatalog, as it promotes code re-use and makes it easy to swap models. In many pipelines, even ModelCatalog does not need to be called directly, as the Prompt class (knowledge retrieval) and the LLMfx class (agents and function calls) provide seamless workflow capabilities and are built on top of the ModelCatalog.

Top examples to get started:

- [Models examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Models) - ~20 examples showing a wide range of different model inferences and use cases, including the ability to integrate Ollama models, OpenChat (e.g., LMStudio) models, using LLama-3 and Phi-3, bringing your own models into the ModelCatalog, and configuring sampling settings.
- [Prompts examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Prompts) - ~5 examples that illustrate how to use Prompt as an integrated workflow for integrating knowledge sources, managing prompt history, and applying fact-checking.
- [SLIM-Agents examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents) - ~20 examples showing how to build multi-model, multi-step Agent processes using locally-running SLIM function calling models.
- [Fast start example #3-Prompts and Models](https://www.github.com/llmware-ai/llmware/tree/main/fast_start/example-3-prompts_and_models.py) - getting started with model inference.

In addition, to support these two key pipelines, LLMWare has a set of supporting and enabling classes and methods, including:

- resource module: CollectionRetrieval, CollectionWriter, PromptState, QueryState, and ParserState - provides an abstraction layer on top of underlying database repositories and separate state mechanisms for major classes
- gguf_configs module: GGUFConfigs
- model_configs module: global_model_repo_catalog_list, global_model_finetuning_prompt_wrappers_lookup, global_default_prompt_catalog
- util module: Utilities
- setup module: Setup
- status module: Status
- exceptions module: LLMWare Exceptions
- web_services module: classes for Wikipedia, YFinance, and WebSite extraction

**End-to-End Use Cases** - we publish and maintain a number of end-to-end use cases in [examples/Use_Cases](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases)
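As referenced in the Model Prompting key methods above, here is a minimal sketch of the discover / load / infer sequence; the model name, prompt text and context passage are illustrative choices.

```python
from llmware.models import ModelCatalog

# discover the models registered in the catalog
all_models = ModelCatalog().list_all_models()
print("number of models in catalog: ", len(all_models))

# load a model by name - "phi-3-gguf" is an illustrative choice used elsewhere in these docs
model = ModelCatalog().load_model("phi-3-gguf")

# run an inference, passing an optional context passage to ground the answer
context = "The lease term is 36 months, starting on January 1, 2024."
response = model.inference("What is the length of the lease term?", add_context=context)

print("response: ", response)
```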
Need help or have questions?
============================

Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).

Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).

## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!

---

---
layout: default
title: Data Stores
parent: Components
nav_order: 9
description: overview of the major modules and classes of LLMWare
permalink: /components/data_stores
---

# Data Stores

---

Simple-to-Scale Database Options - integrated data stores from laptop to parallelized cluster.

```python
from llmware.configs import LLMWareConfig

# to set the collection database - mongo, sqlite, postgres
LLMWareConfig().set_active_db("mongo")

# to set the vector database (or declare when installing)
# -- options: milvus, pg_vector (postgres), redis, qdrant, faiss, pinecone, mongo atlas
LLMWareConfig().set_vector_db("milvus")

# for fast start - no installations required
LLMWareConfig().set_active_db("sqlite")
LLMWareConfig().set_vector_db("chromadb")

# try also faiss and lancedb

# for single postgres deployment
LLMWareConfig().set_active_db("postgres")
LLMWareConfig().set_vector_db("postgres")

# to install mongo, milvus, postgres - see the docker-compose scripts as well as examples
```

Need help or have questions?
============================

Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).

Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).

## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!

---

---
layout: default
title: Embedding Models
parent: Components
nav_order: 6
description: overview of the major modules and classes of LLMWare
permalink: /components/embedding_models
---

# Embedding Models

---

llmware supports 30+ embedding models out of the box in the default ModelCatalog, with easy extensibility to add other popular open source embedding models from HuggingFace or Sentence Transformers.

To get a list of the currently supported embedding models:

```python
from llmware.models import ModelCatalog

embedding_models = ModelCatalog().list_embedding_models()

for i, models in enumerate(embedding_models):
    print(f"embedding models: {i} - {models}")
```

Supported popular models include:

- Sentence Transformers - `all-MiniLM-L6-v2`, `all-mpnet-base-v2`
- Jina AI - `jinaai/jina-embeddings-v2-base-en`, `jinaai/jina-embeddings-v2-small-en`
- Nomic - `nomic-ai/nomic-embed-text-v1`
- Industry BERT - `industry-bert-insurance`, `industry-bert-contracts`, `industry-bert-asset-management`, `industry-bert-sec`, `industry-bert-loans`
- OpenAI - `text-embedding-ada-002`, `text-embedding-3-small`, `text-embedding-3-large`

We also support top embedding models from BAAI, thenlper, llmrails/ember, Google, and Cohere. We are constantly looking to add new innovative open source models to this list, so please let us know if you are looking for support for a specific embedding model - usually within 1-2 days, we can test it and add it to the ModelCatalog.

# Using an Embedding Model

Embedding models in llmware can be loaded directly with `ModelCatalog().load_model("model_name")`, but in most cases, the name of the embedding model will be passed to the `install_new_embedding` handler in the Library class when creating a new embedding.
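For reference, here is a minimal sketch of loading an embedding model directly through the ModelCatalog. The ``embedding()`` call below is an assumption for illustration (check the models module for the exact interface of the loaded embedding model class); in most pipelines you will simply pass the model name to ``install_new_embedding``, as described next.

```python
from llmware.models import ModelCatalog

# load an embedding model directly from the catalog by name
embedding_model = ModelCatalog().load_model("mini-lm-sbert")

# assumption for illustration: the loaded embedding model exposes an embedding()
# method that vectorizes the input text
vector = embedding_model.embedding("What is the notice period for termination?")

print("embedding output type: ", type(vector))
```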
Once that is completed, the embedding model is captured in the Library metadata on the LibraryCard as part of the embedding record for that library, and as a result, often does not need to be referenced explicitly again, e.g.,

```python
from llmware.library import Library

library = Library().create_new_library("my_library")

# parses the content from the documents in the file path, text chunks and indexes in a text collection database
library.add_files(input_folder_path="/local/path/to/my_files", chunk_size=400, max_chunk_size=600, smart_chunking=1)

# creates embeddings - and keeps synchronized records of which text chunks have been embedded to enable incremental use
library.install_new_embedding(embedding_model_name="jinaai/jina-embeddings-v2-small-en", vector_db="milvus", batch_size=100)
```

Once the embeddings are installed on the library, you can look up the embedding status to see the updated embeddings, and confirm that the model has been correctly captured:

```python
from llmware.library import Library

library = Library().load_library("my_library")
embedding_record = library.get_embedding_status()

print("\nupdate: embedding record - ", embedding_record)
```

And then you can run semantic retrievals on the Library, using the Query class in the retrieval module, e.g.:

```python
from llmware.library import Library
from llmware.retrieval import Query

library = Library().load_library("my_library")

# queries are constructed by creating a Query object, and passing a library as input
query_results = Query(library).semantic_query("my query", result_count=20)

for qr in query_results:
    print("my query results: ", qr)
```

Need help or have questions?
============================

Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).

Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).

## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!
---

---
layout: default
title: GGUF
parent: Components
nav_order: 14
description: overview of the major modules and classes of LLMWare
permalink: /components/gguf
---

# GGUF

---

llmware packages its own build of the llama.cpp backend engine to enable running quantized models in GGUF format, which provides an effective packaging to run small language models on both CPUs and GPUs, with fast loading and inference.

The GGUF capability is implemented in the models.py module in the class `GGUFGenerativeModel`, with an extensive set of interfaces and configurations provided in the gguf_configs.py module (which for most users and use cases do not need to be adjusted).

Using a GGUF model is the same as using any other model in the ModelCatalog, e.g.,

```python
from llmware.models import ModelCatalog

gguf_model = ModelCatalog().load_model("phi-3-gguf")
response = gguf_model.inference("What are the benefits of small specialized language models?")

print("response: ", response)
```

# GGUF Platform Support

Within the llmware library, we currently package 6 separate builds of the gguf llama.cpp engine for the following platforms:

# Mac M1/M2/M3

- with Accelerate: "libllama_mac_metal.dylib"
- without Accelerate: "libllama_mac_metal_no_acc.dylib" (note: if you have an old Mac OS installed, it may not have full Accelerate support)
- By default on Mac M1/M2/M3, it will attempt to use the Accelerate (faster) back-end, and if that fails, then it will automatically revert to the no-acc version

# Windows

- CUDA version
- CPU version
- Will look for CUDA drivers, and if found, will try to use the CUDA build, but if that fails, then it will automatically revert to the CPU version.

# Linux

- CUDA version
- CPU version
- Will look for CUDA drivers, and if found, will try to use the CUDA build, but if that fails, then it will automatically revert to the CPU version.

# Troubleshooting CUDA on Windows and Linux

Requirement: Nvidia CUDA 12.1+

-- how to check: `nvcc --version` and `nvidia-smi` - if not found, then drivers are either not installed or not in $PATH and need to be configured

-- if you have older drivers (e.g., v11), then you will need to update them.

# Bring your own custom llama.cpp gguf backend

If you have a unique system requirement, or are looking to optimize for a particular BLAS library with your own build, you can bring your own llama.cpp: build llama_cpp from source and apply custom build settings, or find a prebuilt llama_cpp library in the community that matches your platform. Happy to help if you share the requirements.

```python
from llmware.gguf_configs import GGUFConfigs

GGUFConfigs().set_config("custom_lib_path", "/path/to/your/custom/llama_cpp_backend")

# ... and then load and run the model as usual - the GGUF model class will look at this config and
# load the llama.cpp library found at the custom lib path
```

# Streaming GGUF

```python
""" This example illustrates how to use the stream method for GGUF models for fast streaming of inference,
    especially for real-time chat interactions.

    Please note that the stream method has been implemented for GGUF models starting in llmware-0.2.13. This
    will be any model with the GGUFGenerativeModel class, and generally includes models with names that end in "gguf".

    See also the chat UI example in the UI examples folder.

    We would recommend using a chat optimized model, and have included a representative list below.
""" from llmware.models import ModelCatalog from llmware.gguf_configs import GGUFConfigs # sets an absolute output maximum for the GGUF engine - normally set by default at 256 GGUFConfigs().set_config("max_output_tokens", 1000) chat_models = ["phi-3-gguf", "llama-2-7b-chat-gguf", "llama-3-instruct-bartowski-gguf", "openhermes-mistral-7b-gguf", "zephyr-7b-gguf", "tiny-llama-chat-gguf"] model_name = chat_models[0] # maximum output can be set optionally at any number up to the "max_output_tokens" set model = ModelCatalog().load_model(model_name, max_output=500) text_out = "" token_count = 0 # prompt = "I am interested in gaining an understanding of the banking industry. What topics should I research?" prompt = "What are the benefits of small specialized LLMs?" # since model.stream provides a generator, then use as follows to consume the generator for streamed_token in model.stream(prompt): text_out += streamed_token if text_out.strip(): print(streamed_token, end="") token_count += 1 # final output text and token count print("\n\n***total text out***: ", text_out) print("\n***total tokens***: ", token_count) ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in Oktober 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``! --- --- --- --- layout: default title: Library parent: Components nav_order: 7 description: overview of the major modules and classes of LLMWare permalink: /components/library --- # Library: ingest, organize and index a collection of knowledge at scale - Parse, Text Chunk and Embed. --- Library is the main organizing construct for unstructured information in LLMWare. Users can create one large library with all types of different content, or can create multiple libraries with each library comprising a specific logical collection of information on a particular subject matter, project/case/deal, or even different accounts/users/departments. Each Library consists of the following components: 1. 
Collection on a Database - this is the core of the Library, and is created through parsing of documents, which are then automatically chunked and indexed in a text collection database. This is the basis for retrieval, and also the collection used to track any number of vector embeddings that can be attached to a library collection. 2. File archives - found in the llmware_data path, within Accounts, there is a folder structure for each Library. All file-based artifacts for the Library are organized in these folders, including copies of all files added to the library (very useful for retrieval-based applications), images extracted and indexed from the source documents, as well as derived artifacts such as NLP and knowledge graph outputs and datasets. 3. Library Catalog - each Library is registered in the LibraryCatalog table, with a unique library_card that has the key attributes and statistics of the Library. When a Library object is passed to the Parser, the parser will automatically route all information into the Library structure. The Library also exposes convenience methods to easily install embeddings on a library, including tracking of incremental progress. To parse into a Library, there is a very useful convenience method, `add_files`, which will invoke the Parser, collate and route the files within a selected folder path, check for duplicate files, execute the parsing, text chunking and insertion into the database, and update all of the Library state automatically. Libraries are the main index constructs used in executing a Query. Pass the library object when constructing the Query object, and then all retrievals (text, semantic and hybrid) will be executed against the content in that Library only. ```python from llmware.library import Library # to parse and text chunk a set of documents (pdf, pptx, docx, xlsx, txt, csv, md, json/jsonl, wav, png, jpg, html) # step 1 - create a library, which is the 'knowledge-base container' construct # - libraries have both text collection (DB) resources, and file resources (e.g., llmware_data/accounts/{library_name}) # - embeddings and queries are run against a library lib = Library().create_new_library("my_library") # step 2 - add_files is the universal ingestion function - point it at a local file folder with mixed file types # - files will be routed by file extension to the correct parser, parsed, text chunked and indexed in text collection DB lib.add_files("/folder/path/to/my/files") # to install an embedding on a library - pick an embedding model and vector_db lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="milvus", batch_size=500) # to add a second embedding to the same library (mix-and-match models + vector db) lib.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="chromadb", batch_size=100) # easy to create multiple libraries for different projects and groups finance_lib = Library().create_new_library("finance_q4_2023") finance_lib.add_files("/finance_folder/") hr_lib = Library().create_new_library("hr_policies") hr_lib.add_files("/hr_folder/") # pull library card with key metadata - documents, text chunks, images, tables, embedding record lib_card = Library().get_library_card("my_library") # see all libraries all_my_libs = Library().get_all_library_cards() ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).
Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``! --- --- --- --- layout: default title: Model Catalog parent: Components nav_order: 2 description: overview of the major modules and classes of LLMWare permalink: /components/model_catalog --- # Model Catalog: Access all models the same way with easy lookup, regardless of underlying implementation. - 150+ Models in Catalog with 50+ RAG-optimized BLING, DRAGON and Industry BERT models - 18 SLIM function-calling small language models for Agent use cases - Full support for GGUF, HuggingFace, Sentence Transformers and major API-based models - Easy to extend to add custom models - see examples Generally, all models can be identified using either the `model_name` or `display_name`, which provides some flexibility to expose a more "UI friendly" name or an informal short-name for a commonly-used model. The default model list is implemented in the model_configs.py module, which is then generally accessed in the models.py module through the `ModelCatalog` class, which also provides the ability to add models of various types, overwrite the default list by loading a custom model catalog from a JSON file, and other useful interfaces into the list of models.
```python from llmware.models import ModelCatalog from llmware.prompts import Prompt # all models accessed through the ModelCatalog models = ModelCatalog().list_all_models() # to use any model in the ModelCatalog - "load_model" method and pass the model_name parameter my_model = ModelCatalog().load_model("llmware/bling-phi-3-gguf") output = my_model.inference("what is the future of AI?", add_context="Here is the article to read") # to integrate model into a Prompt prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") response = prompter.prompt_main("what is the future of AI?", context="Insert Sources of information") ``` # ADD a Custom GGUF to the ModelCatalog ```python import time import re from llmware.models import ModelCatalog from llmware.prompts import Prompt # Step 1 - register new gguf model - we will pick the popular LLama-2-13B-chat-GGUF ModelCatalog().register_gguf_model(model_name="TheBloke/Llama-2-13B-chat-GGUF-Q2", gguf_model_repo="TheBloke/Llama-2-13B-chat-GGUF", gguf_model_file_name="llama-2-13b-chat.Q2_K.gguf", prompt_wrapper="my_version_inst") # Step 2- if the prompt_wrapper is a standard, e.g., Meta's , then no need to do anything else # -- however, if the model uses a custom prompt wrapper, then we need to define that too # -- in this case, we are going to create our "own version" of the Meta wrapper ModelCatalog().register_new_finetune_wrapper("my_version_inst", main_start="", llm_start="") # Once we have completed these two steps, we are done - and can begin to use the model like any other prompter = Prompt().load_model("TheBloke/Llama-2-13B-chat-GGUF-Q2") question_list = ["I am interested in gaining an understanding of the banking industry. What topics should I research?", "What are some tips for creating a successful business plan?", "What are the best books to read for a class on American literature?"] for i, entry in enumerate(question_list): start_time = time.time() print("\n") print(f"query - {i + 1} - {entry}") response = prompter.prompt_main(entry) # Print results time_taken = round(time.time() - start_time, 2) llm_response = re.sub("[\n\n]", "\n", response['llm_response']) print(f"llm_response - {i + 1} - {llm_response}") print(f"time_taken - {i + 1} - {time_taken}") ``` # ADD an Ollama Model ```python from llmware.models import ModelCatalog # Step 1 - register your Ollama models in llmware ModelCatalog # -- these two lines will register: llama2 and mistral models # -- note: assumes that you have previously cached and installed both of these models with ollama locally # register llama2 ModelCatalog().register_ollama_model(model_name="llama2",model_type="chat",host="localhost",port=11434) # register mistral - note: if you are using ollama defaults, then OK to register with ollama model name only ModelCatalog().register_ollama_model(model_name="mistral") # optional - confirm that model was registered my_new_model_card = ModelCatalog().lookup_model_card("llama2") print("\nupdate: confirming - new ollama model card - ", my_new_model_card) # Step 2 - start using the Ollama model like any other model in llmware print("\nupdate: calling ollama llama 2 model ...") model = ModelCatalog().load_model("llama2") response = model.inference("why is the sky blue?") print("update: example #1 - ollama llama 2 response - ", response) # Tip: if you are loading 'llama2' chat model from Ollama, note that it is already included in # the llmware model catalog under a different name, "TheBloke/Llama-2-7B-Chat-GGUF" # the llmware model name maps to the original HuggingFace 
repository, and is a nod to "TheBloke" who has # led the popularization of GGUF - and is responsible for creating most of the GGUF model versions. # --llmware uses the "Q4_K_M" model by default, while Ollama generally prefers "Q4_0" print("\nupdate: calling Llama-2-7B-Chat-GGUF in llmware catalog ...") model = ModelCatalog().load_model("TheBloke/Llama-2-7B-Chat-GGUF") response = model.inference("why is the sky blue?") print("update: example #1 - [compare] - llmware / Llama-2-7B-Chat-GGUF response - ", response) # Now, let's try the Ollama Mistral model with a context passage model2 = ModelCatalog().load_model("mistral") context_passage= ("NASA’s rover Perseverance has gathered data confirming the existence of ancient lake " "sediments deposited by water that once filled a giant basin on Mars called Jerezo Crater, " "according to a study published on Friday. The findings from ground-penetrating radar " "observations conducted by the robotic rover substantiate previous orbital imagery and " "other data leading scientists to theorize that portions of Mars were once covered in water " "and may have harbored microbial life. The research, led by teams from the University of " "California at Los Angeles (UCLA) and the University of Oslo, was published in the " "journal Science Advances. It was based on subsurface scans taken by the car-sized, six-wheeled " "rover over several months of 2022 as it made its way across the Martian surface from the " "crater floor onto an adjacent expanse of braided, sedimentary-like features resembling, " "from orbit, the river deltas found on Earth.") response = model2.inference("What are the top 3 points?", add_context=context_passage) print("\nupdate: calling ollama mistral model ...") print("update: example #2 - ollama mistral response - ", response) # Step 3 - using the ollama discovery API - optional discovery = model2.discover_models() print("\nupdate: example #3 - checking ollama model manifest list: ", discovery) if len(discovery) > 0: # note: assumes tht you have at least one model registered in ollama -otherwise, may throw error for i, models in enumerate(discovery["models"]): print("ollama models: ", i, models) ``` # Add a LM Studio Model ```python from llmware.models import ModelCatalog from llmware.prompts import Prompt # one step process: add the open chat model to the Model Registry # key params: # model_name = "my_open_chat_model1" # api_base = uri_path to the proposed endpoint # prompt_wrapper = alpaca | | chat_ml | hf_chat | human_bot # -> Llama2-Chat # hf_chat -> Zephyr-Mistral # chat_ml -> OpenHermes - Mistral # human_bot -> Dragon models # model_type = "chat" (alternative: "completion") ModelCatalog().register_open_chat_model("my_open_chat_model1", api_base="http://localhost:1234/v1", prompt_wrapper="", model_type="chat") # once registered, you can invoke like any other model in llmware prompter = Prompt().load_model("my_open_chat_model1") response = prompter.prompt_main("What is the future of AI?") # you can (optionally) register multiple open chat models with different api_base and model attributes ModelCatalog().register_open_chat_model("my_open_chat_model2", api_base="http://localhost:5678/v1", prompt_wrapper="hf_chat", model_type="chat") ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). 
# About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  • {% endfor %}
--- --- --- --- layout: default title: Prompt with Sources parent: Components nav_order: 10 description: overview of the major modules and classes of LLMWare permalink: /components/prompt_with_sources --- # Prompt with Sources --- Prompt with Sources is the easiest way to combine knowledge retrieval with an LLM inference - the Prompt class provides several high-level methods to easily integrate a retrieval/query/parsing step into a prompt, to be used as a source for running an inference on a model. This is best illustrated with a simple example: ```python from llmware.prompts import Prompt # build a prompt and attach a model prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") # add_source_document method: accepts any supported document type, parses the file, and creates text chunks # if a query is passed, then it will run a quick in-memory filtering search against the text chunks # the text chunks are packaged into sources with all of the accompanying metadata from the file, and made # available automatically in batches to be used in prompting - source = prompter.add_source_document("/folder/to/one/doc/", "filename", query="fast query") # to run inference with 'prompt with sources' -> source will be automatically added to the prompt responses = prompter.prompt_with_source("my query") # depending upon the size of the source (and batching relative to the model context window), there may be more than # a single inference run, so unpack potentially multiple responses for i, response in enumerate(responses): print("response: ", i, response) ``` # FACT CHECKING prompt_with_source also provides integrated fact-checking methods that use the packaged source information to validate key elements of the llm_response. ```python from llmware.prompts import Prompt prompter = Prompt().load_model("bling-answer-tool", temperature=0.0, sample=False) # contract is parsed, text-chunked, and then filtered by "base salary" source = prompter.add_source_document("/local/folder/path", "my_document.pdf", query="exact filter query") # calling the LLM with 'source' information from the contract automatically packaged into the prompt responses = prompter.prompt_with_source("my question to the document", prompt_name="default_with_context") # run several fact checks # checks for numbers match ev_numbers = prompter.evidence_check_numbers(responses) # looks for statistical overlap to identify potential sources for the llm response ev_sources = prompter.evidence_check_sources(responses) # builds set of comparison stats between the llm_response and the sources ev_stats = prompter.evidence_comparison_stats(responses) # identifies if a response is a "not found" response z = prompter.classify_not_found_response(responses, parse_response=True, evidence_match=True, ask_the_model=False) for r, response in enumerate(responses): print("LLM Response: ", response["llm_response"]) print("Numbers: ", ev_numbers[r]["fact_check"]) print("Sources: ", ev_sources[r]["source_review"]) print("Stats: ", ev_stats[r]["comparison_stats"]) print("Not Found Check: ", z[r]) ``` In addition to `add_source_document`, the Prompt class implements the following other methods to easily integrate sources into prompts: # Add Source - Query Results - Two Options ```python from llmware.prompts import Prompt from llmware.retrieval import Query from llmware.library import Library # build a prompt prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") # Option A - run query and then add query_results to the prompt my_lib =
Library().load_library("my_library") results = Query(my_lib).query("my query") source2 = prompter.add_source_query_results(results) # Option B - run a new query against a library and load directly into a prompt source3 = prompter.add_source_new_query(my_lib, query="my new query", query_type="semantic", result_count=15) ``` # Add Other Sources ```python from llmware.prompts import Prompt # build a prompt prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") # add wikipedia articles as a source wiki_source = prompter.add_source_wikipedia("topic", article_count=5, query="filter among retrieved articles") # add a website as a source website_source = prompter.add_source_website("my_url", query="filter among website") # add an entire library (should be small, e.g., just a couple of documents) source = prompter.add_source_library("my_library") ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in Oktober 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  • {% endfor %}
--- --- --- --- layout: default title: Query parent: Components nav_order: 8 description: overview of the major modules and classes of LLMWare permalink: /components/query --- # Retrieval & Query --- Query libraries with a mix of text, semantic, hybrid, metadata, and custom filters. The retrieval.py module implements the `Query` class, which is the primary way that search and retrieval is performed. Each `Query` object, when constructed, requires that a Library be passed as a mandatory parameter in the constructor. The Query object will operate against that Library, and has access to all of the Library's specific attributes, metadata and methods. Retrievals in llmware leverage the Library abstraction as the primary unit against which a particular query or retrieval is executed. This provides the ability to have multiple distinct knowledge-bases, potentially aligned to different use cases, and/or users, accounts and permissions. # Executing Queries ```python from llmware.retrieval import Query from llmware.library import Library # step 1 - load a previously created library lib = Library().load_library("my_library") # step 2 - create a query object q = Query(lib) # step 3 - run lots of different queries (many other options in the examples) # basic text query results1 = q.text_query("text query", result_count=20, exact_mode=False) # semantic query results2 = q.semantic_query("semantic query", result_count=10) # combining a text query restricted to only certain documents in the library and "exact" match to the query results3 = q.text_query_with_document_filter("new query", {"file_name": "selected file name"}, exact_mode=True) # to apply a specific embedding (if multiple on library), pass the model and vector db names when creating the query object q2 = Query(lib, embedding_model_name="mini_lm_sbert", vector_db="milvus") results4 = q2.semantic_query("new semantic query") ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  • {% endfor %}
--- --- --- --- layout: default title: RAG Optimized Models parent: Components nav_order: 3 description: overview of the major modules and classes of LLMWare permalink: /components/rag_optimized_models --- # RAG Optimized Models --- RAG-Optimized Models - 1-7B parameter models designed for RAG workflow integration and running locally. ## Meet our Models - **SLIM model series:** small, specialized models fine-tuned for function calling and multi-step, multi-model Agent workflows. - **DRAGON model series:** Production-grade RAG-optimized 6-7B parameter models - "Delivering RAG on ..." the leading foundation base models. - **BLING model series:** Small CPU-based RAG-optimized, instruct-following 1B-3B parameter models. - **Industry BERT models:** out-of-the-box custom trained sentence transformer embedding models fine-tuned for the following industries: Insurance, Contracts, Asset Management, SEC. - **GGUF Quantization:** we provide 'gguf' and 'tool' versions of many SLIM, DRAGON and BLING models, optimized for CPU deployment. ```python """ This 'Hello World' example demonstrates how to get started using local BLING models with provided context, using both Pytorch and GGUF versions. """ import time from llmware.prompts import Prompt def hello_world_questions(): test_list = [ {"query": "What is the total amount of the invoice?", "answer": "$22,500.00", "context": "Services Vendor Inc. \n100 Elm Street Pleasantville, NY \nTO Alpha Inc. 5900 1st Street " "Los Angeles, CA \nDescription Front End Engineering Service $5000.00 \n Back End Engineering" " Service $7500.00 \n Quality Assurance Manager $10,000.00 \n Total Amount $22,500.00 \n" "Make all checks payable to Services Vendor Inc. Payment is due within 30 days." "If you have any questions concerning this invoice, contact Bia Hermes. " "THANK YOU FOR YOUR BUSINESS! INVOICE INVOICE # 0001 DATE 01/01/2022 FOR Alpha Project P.O. # 1000"}, {"query": "What was the amount of the trade surplus?", "answer": "62.4 billion yen ($416.6 million)", "context": "Japan’s September trade balance swings into surplus, surprising expectations" "Japan recorded a trade surplus of 62.4 billion yen ($416.6 million) for September, " "beating expectations from economists polled by Reuters for a trade deficit of 42.5 " "billion yen. Data from Japan’s customs agency revealed that exports in September " "increased 4.3% year on year, while imports slid 16.3% compared to the same period " "last year. According to FactSet, exports to Asia fell for the ninth straight month, " "which reflected ongoing China weakness. Exports were supported by shipments to " "Western markets, FactSet added. — Lim Hui Jie"}, {"query": "When did the LISP machine market collapse?", "answer": "1987.", "context": "The attendees became the leaders of AI research in the 1960s." " They and their students produced programs that the press described as 'astonishing': " "computers were learning checkers strategies, solving word problems in algebra, " "proving logical theorems and speaking English. By the middle of the 1960s, research in " "the U.S. was heavily funded by the Department of Defense and laboratories had been " "established around the world. Herbert Simon predicted, 'machines will be capable, " "within twenty years, of doing any work a man can do'. Marvin Minsky agreed, writing, " "'within a generation ... the problem of creating 'artificial intelligence' will " "substantially be solved'. They had, however, underestimated the difficulty of the problem. " "Both the U.S. 
and British governments cut off exploratory research in response " "to the criticism of Sir James Lighthill and ongoing pressure from the US Congress " "to fund more productive projects. Minsky's and Papert's book Perceptrons was understood " "as proving that artificial neural networks approach would never be useful for solving " "real-world tasks, thus discrediting the approach altogether. The 'AI winter', a period " "when obtaining funding for AI projects was difficult, followed. In the early 1980s, " "AI research was revived by the commercial success of expert systems, a form of AI " "program that simulated the knowledge and analytical skills of human experts. By 1985, " "the market for AI had reached over a billion dollars. At the same time, Japan's fifth " "generation computer project inspired the U.S. and British governments to restore funding " "for academic research. However, beginning with the collapse of the Lisp Machine market " "in 1987, AI once again fell into disrepute, and a second, longer-lasting winter began."}, {"query": "What is the current rate on 10-year treasuries?", "answer": "4.58%", "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, " "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy " "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " "jobs. However, wages rose less than expected last month. Stocks posted a stunning " "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " "At its session low, the Dow had fallen as much as 198 points; it surged by more than " "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " "their lowest points in the day. Traders were unclear of the reason for the intraday " "reversal. Some noted it could be the softer wage number in the jobs report that made " "investors rethink their earlier bearish stance. Others noted the pullback in yields from " "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " "near its highest level in 14 years. The benchmark rate later eased from those levels, but " "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s " "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries " "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially " "some oversold conditions.'"}, {"query": "Is the expected gross margin greater than 70%?", "answer": "Yes, between 71.5% and 72.%", "context": "Outlook NVIDIA’s outlook for the third quarter of fiscal 2024 is as follows:" "Revenue is expected to be $16.00 billion, plus or minus 2%. GAAP and non-GAAP " "gross margins are expected to be 71.5% and 72.5%, respectively, plus or minus " "50 basis points. GAAP and non-GAAP operating expenses are expected to be " "approximately $2.95 billion and $2.00 billion, respectively. 
GAAP and non-GAAP " "other income and expense are expected to be an income of approximately $100 " "million, excluding gains and losses from non-affiliated investments. GAAP and " "non-GAAP tax rates are expected to be 14.5%, plus or minus 1%, excluding any discrete items." "Highlights NVIDIA achieved progress since its previous earnings announcement " "in these areas: Data Center Second-quarter revenue was a record $10.32 billion, " "up 141% from the previous quarter and up 171% from a year ago. Announced that the " "NVIDIA® GH200 Grace™ Hopper™ Superchip for complex AI and HPC workloads is shipping " "this quarter, with a second-generation version with HBM3e memory expected to ship " "in Q2 of calendar 2024. "}, {"query": "What is Bank of America's rating on Target?", "answer": "Buy", "context": "Here are some of the tickers on my radar for Thursday, Oct. 12, taken directly from " "my reporter’s notebook: It’s the one-year anniversary of the S&P 500′s bear market bottom " "of 3,577. Since then, as of Wednesday’s close of 4,376, the broad market index " "soared more than 22%. Hotter than expected September consumer price index, consumer " "inflation. The Social Security Administration issues announced a 3.2% cost-of-living " "adjustment for 2024. Chipotle Mexican Grill (CMG) plans price increases. Pricing power. " "Cites consumer price index showing sticky retail inflation for the fourth time " "in two years. Bank of America upgrades Target (TGT) to buy from neutral. Cites " "risk/reward from depressed levels. Traffic could improve. Gross margin upside. " "Merchandising better. Freight and transportation better. Target to report quarter " "next month. In retail, the CNBC Investing Club portfolio owns TJX Companies (TJX), " "the off-price juggernaut behind T.J. Maxx, Marshalls and HomeGoods. Goldman Sachs " "tactical buy trades on Club names Wells Fargo (WFC), which reports quarter Friday, " "Humana (HUM) and Nvidia (NVDA). BofA initiates Snowflake (SNOW) with a buy rating." "If you like this story, sign up for Jim Cramer’s Top 10 Morning Thoughts on the " "Market email newsletter for free. Barclays cuts price targets on consumer products: " "UTZ Brands (UTZ) to $16 per share from $17. Kraft Heinz (KHC) to $36 per share from " "$38. Cyclical drag. J.M. Smucker (SJM) to $129 from $160. Secular headwinds. " "Coca-Cola (KO) to $59 from $70. Barclays cut PTs on housing-related stocks: Toll Brothers" "(TOL) to $74 per share from $82. Keeps underweight. Lowers Trex (TREX) and Azek" "(AZEK), too. Goldman Sachs (GS) announces sale of fintech platform and warns on " "third quarter of 19-cent per share drag on earnings. The buyer: investors led by " "private equity firm Sixth Street. Exiting a mistake. Rise in consumer engagement for " "Spotify (SPOT), says Morgan Stanley. The analysts hike price target to $190 per share " "from $185. Keeps overweight (buy) rating. JPMorgan loves elf Beauty (ELF). Keeps " "overweight (buy) rating but lowers price target to $139 per share from $150. " "Sees “still challenging” environment into third-quarter print. The Club owns shares " "in high-end beauty company Estee Lauder (EL). Barclays upgrades First Solar (FSLR) " "to overweight from equal weight (buy from hold) but lowers price target to $224 per " "share from $230. Risk reward upgrade. 
Best visibility of utility scale names."}, {"query": "What was the rate of decline in 3rd quarter sales?", "answer": "20% year-on-year.", "context": "Nokia said it would cut up to 14,000 jobs as part of a cost cutting plan following " "third quarter earnings that plunged. The Finnish telecommunications giant said that " "it will reduce its cost base and increase operation efficiency to “address the " "challenging market environment. The substantial layoffs come after Nokia reported " "third-quarter net sales declined 20% year-on-year to 4.98 billion euros. Profit over " "the period plunged by 69% year-on-year to 133 million euros."}, {"query": "What is a list of the key points?", "answer": "•Stocks rallied on Friday with stronger-than-expected U.S jobs data and increase in " "Treasury yields;\n•Dow Jones gained 195.12 points;\n•S&P 500 added 1.59%;\n•Nasdaq Composite rose " "1.35%;\n•U.S. economy added 438,000 jobs in August, better than the 273,000 expected;\n" "•10-year Treasury rate trading near the highest level in 14 years at 4.58%.", "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, " "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy " "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " "jobs. However, wages rose less than expected last month. Stocks posted a stunning " "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " "At its session low, the Dow had fallen as much as 198 points; it surged by more than " "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " "their lowest points in the day. Traders were unclear of the reason for the intraday " "reversal. Some noted it could be the softer wage number in the jobs report that made " "investors rethink their earlier bearish stance. Others noted the pullback in yields from " "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " "near its highest level in 14 years. The benchmark rate later eased from those levels, but " "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s " "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries " "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially " "some oversold conditions.'"} ] return test_list # this is the main script to be run def bling_meets_llmware_hello_world (model_name): t0 = time.time() # load the questions test_list = hello_world_questions() print(f"\n > Loading Model: {model_name}...") # load the model prompter = Prompt().load_model(model_name) t1 = time.time() print(f"\n > Model {model_name} load time: {t1-t0} seconds") for i, entries in enumerate(test_list): print(f"\n{i+1}. 
Query: {entries['query']}") # run the prompt output = prompter.prompt_main(entries["query"],context=entries["context"] , prompt_name="default_with_context",temperature=0.30) # print out the results llm_response = output["llm_response"].strip("\n") print(f"LLM Response: {llm_response}") print(f"Gold Answer: {entries['answer']}") print(f"LLM Usage: {output['usage']}") t2 = time.time() print(f"\nTotal processing time: {t2-t1} seconds") return 0 if __name__ == "__main__": # list of 'rag-instruct' laptop-ready small bling models on HuggingFace pytorch_models = ["llmware/bling-1b-0.1", # most popular "llmware/bling-tiny-llama-v0", # fastest "llmware/bling-1.4b-0.1", "llmware/bling-falcon-1b-0.1", "llmware/bling-cerebras-1.3b-0.1", "llmware/bling-sheared-llama-1.3b-0.1", "llmware/bling-sheared-llama-2.7b-0.1", "llmware/bling-red-pajamas-3b-0.1", "llmware/bling-stable-lm-3b-4e1t-v0", "llmware/bling-phi-3" # most accurate (and newest) ] # Quantized GGUF versions generally load faster and run nicely on a laptop with at least 16 GB of RAM gguf_models = ["bling-phi-3-gguf", "bling-stablelm-3b-tool", "dragon-llama-answer-tool", "dragon-yi-answer-tool", "dragon-mistral-answer-tool"] # try model from either pytorch or gguf model list # the newest (and most accurate) is 'bling-phi-3-gguf' bling_meets_llmware_hello_world(gguf_models[0]) # check out the model card on Huggingface for RAG benchmark test performance results and other useful information ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in Oktober 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  • {% endfor %}
--- --- --- --- layout: default title: Release History parent: Components nav_order: 15 description: llmware is an integrated framework with 50+ models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. permalink: /components/release_history --- Release History --- - For Specific Wheels: [Wheel Archives](https://www.github.com/llmware-ai/llmware/tree/main/wheel_archives) - For Feature Details: [Main README-'Release notes and Change Log'](https://www.github.com/llmware-ai/llmware/tree/main/) New wheels are generally built and published to PyPI on a weekly basis, with updated PyPI versioning. The development repo is updated and current at all times, but may have updates that are not yet in the PyPI wheel. All wheels are built and tested on: 1. Mac Metal 2. Windows x86 (+ with CUDA) 3. Linux x86 (+ with CUDA) - most testing on Ubuntu 22 and Ubuntu 20 - which are recommended. 4. Mac x86 (see 0.2.11 note below) 5. Linux aarch64* (see 0.2.7 note below) **Release Notes** --**0.3.0** released in the week of June 4, 2024 - continued pruning of required dependencies with split of python dependencies into a small minimal set of requirements (~10 in requirements.txt) that are included in the pip install, with an additional set of optional dependencies provided as 'extras', reflected in both the requirements_extras.txt file, and available over pip with the added instruction - `pip3 install 'llmware[full]'`. Notably, commonly used libraries such as transformers, torch and openai are now in the 'extras' as most llmware use cases do not require them, and this greatly simplifies the ability to install llmware. The `welcome_to_llmware.sh` and `welcome_to_llmware_windows.sh` scripts have also been updated to install both the 'core' and 'extra' sets of requirements. Other subtle, but significant, architectural changes include offering more extensibility for adding new model classes, configurable global base model methods for post_init and register, a new InferenceHistory state manager, and enhanced logging options. --**0.2.15** released in the week of May 20, 2024 - removed pytorch dependency as a global import, and shifted to dynamic loading of torch in the event that it is called in a specific model class. This enables running most of llmware code and examples without pytorch or transformers loaded. The main areas of torch (and transformers) dependency are in using HFGenerativeModels and HFEmbeddingModels. - note: we have seen some new errors caused with Pytorch 2.3 - which are resolved by down-leveling to `pip3 install torch==2.1` - note: there are a couple of new warnings from within transformers and huggingface_hub libraries - these can be safely ignored. We have seen that keeping `local_dir_use_symlinks = False` when pulling model artifacts from Huggingface is still the safer option in some environments. --**0.2.13** released in the week of May 12, 2024 - clean up of dependencies in both requirements.txt and Setup (PyPI) - install of the vector db python sdk (e.g., pymilvus, chromadb, etc.) is now required as a separate step outside of `pip3 install llmware` - the intent is to keep the dependency matrix as simple as possible and avoid potential dependency conflicts on install, especially for packages which in turn have a large number of dependencies. If you run into any issues with install dependencies, please raise an issue. A typical install sequence reflecting these notes is sketched below.
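To illustrate the install pattern described in the 0.3.0 and 0.2.13 notes above, here is a minimal sketch. The core and 'full' commands are taken directly from the notes; the vector db line is only an example - install whichever SDK (pymilvus, chromadb, etc.) matches the vector database you plan to use.

```shell
# core install - minimal set of required dependencies
pip3 install llmware

# optional 'extras' (e.g., transformers, torch, openai) - per the 0.3.0 note
pip3 install 'llmware[full]'

# vector db SDKs are installed separately - per the 0.2.13 note (example only)
pip3 install chromadb
```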
--**0.2.12** released in the week of May 5, 2024 - added Python 3.12 support, and deprecated the use of faiss for v3.12+. We have changed the "Fast Start" no-install option to use chromadb or lancedb rather than faiss. Refactoring of code especially with Datasets, Graph and Web Services as separate modules. --**0.2.11** released in the week of April 29, 2024 - updated GGUF libs for Phi-3 and Llama-3 support, and added new prebuilt shared libraries to support WhisperCPP. We are also deprecating support for Mac x86 going forward - will continue to support on most major components but not all new features going forward will be built specifically for Mac x86 (which Apple stopped shipping in 2022). Our intent is to keep narrowing our testing matrix to provide better support on key platforms. We have also added better safety checks for older versions of Mac OS running on M1/M2/M3 (no_acc option in GGUF and Whisper libs), as well as a custom check to find CUDA drivers on Windows (independent of Pytorch). --**0.2.9** released in the week of April 15, 2024 - minor continued improvements to the parsers plus roll-out of new CustomTable class for rapidly integrating structured information into LLM-based workflows and data pipelines, including converting JSON/JSONL files and CSV files into structured DB tables. --**0.2.8** released in the week of April 8, 2024 - significant improvements to the Office parser with new libs on all platforms. Conforming changes with the PDF parser in terms of exposing more options for text chunking strategies, encoding, and range of capture options (e.g., tables, images, header text, etc). Linux aarch64 libs deprecated and kept at 0.2.6 - some new features will not be available on Linux aarch64 - we recommend using Ubuntu20+ on x86_64 (with and without CUDA). --**0.2.7** released in the week of April 1, 2024 - significant improvements to the PDF parser with new libs on all platforms. Important note that we are keeping linux aarch64 at 0.2.6 libs - and will be deprecating support going forward. For Linux, we recommend Ubuntu20+ and x86_64 (with and without CUDA). --**0.2.5** released in the week of March 12, 2024 - continued enhancements of the GGUF implementation, especially for CUDA support, and re-compiling of all binaries to support Ubuntu 20 and Ubuntu 22. Ubuntu requirements are: CUDA 12.1 (to use GPU), and GLIBC 2.31+. --**GGUF on Windows CUDA**: useful notes and debugging tips - 1. Requirement: Nvidia CUDA 12.1+ -- how to check: `nvcc --version` and `nvidia-smi` - if not found, then drivers are either not installed or not in $PATH and need to be configured -- if you have older drivers (e.g., v11), then you will need to update them. 2. Requirement: CUDA-enabled Pytorch (pre-0.2.11) -- starting with 0.2.11, we have implemented a custom check to evaluate if CUDA is present, independent of Pytorch. -- for pre-0.2.11, we use Pytorch to check for CUDA drivers, e.g., `torch.cuda.is_available()` and `torch.version.cuda` 3. Installing a CUDA-enabled Pytorch - useful install script: (not required post-0.2.11 for GGUF on Windows) -- `pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121` 4. Fall-back to CPU - if llmware can not load the CUDA-enabled drivers, it will automatically try to fall back to the CPU version of the drivers. -- you can also adjust the GGUFConfigs().set_config - ("use_gpu", False) - and then it will automatically go to the CPU drivers. 5. 
Custom GGUF libraries - if you have a unique system requirement, you can build llama_cpp from source, and apply custom build settings - or find in the community a prebuilt llama_cpp library that matches your platform. Happy to help if you share the requirements. -- to "bring your own GGUF": GGUFConfigs().set_config("custom_lib_path", "/path/to/your/custom/llama_cpp_backend") -> and llmware will try to load that library. 6. Issues? - please raise an Issue on GitHub, or on Discord - and we can work with you to get you up and running! --**0.2.4** released in the week of February 26, 2024 - major upgrade of GGUF implementation to support more options, including CUDA support - which is the main source of growth in the size of the wheel package. -- Note: We will look at making some of the CUDA builds 'optional' or 'bring your own' over time. -- Note: We will also start to 'prune' the list of wheels kept in the archive to keep the total repo size manageable for cloning. --**0.2.2** introduced SLIM models and the new LLMfx class, and the capabilities for multi-model, multi-step Agent-based processes. --**0.2.0** released in the week of January 22, 2024 - significant enhancements, including integration of Postgres and SQLite drivers into the C library parsers. --New examples involving Postgres or SQLite support (including 'Fast Start' examples) will require a fresh pip install of 0.2.0 or a clone of the repo. --If cloning the repo, please be especially careful to pick up the new updated /lib dependencies for your platform. --The new libs have new dependencies on Linux in particular - most extensive testing on Ubuntu 22. If any issues arise on a specific version of Linux, please raise a ticket. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  • {% endfor %}
--- --- --- --- layout: default title: SLIM Models parent: Components nav_order: 5 description: overview of the major modules and classes of LLMWare permalink: /components/slim_models --- # SLIM Models - Function Calling with Small Language Models --- Generally, function-calling is a specialized capability of frontier language models, such as OpenAI GPT-4. We have adapted this concept to small language models through SLIMs (Structured Language Instruction Models), which are 'single function' models fine-tuned to accept three main inputs to construct a prompt (a text passage, a function, and optional parameters - described in more detail below). As of June 2024, there are 18 distinct SLIM function calling models, with many more on the way, covering most common extraction, classification, and summarization tasks: **Models List** If you would like more information about any of the SLIM models, please check out their model card: - extract - extract custom keys - [slim-extract](https://www.huggingface.co/llmware/slim-extract) & [slim-extract-tool](https://www.huggingface.co/llmware/slim-extract-tool) - summary - summarize function call - [slim-summary](https://www.huggingface.co/llmware/slim-summary) & [slim-summary-tool](https://www.huggingface.co/llmware/slim-summary-tool) - xsum - title/headline function call - [slim-xsum](https://www.huggingface.co/llmware/slim-xsum) & [slim-xsum-tool](https://www.huggingface.co/llmware/slim-xsum-tool) - ner - extract named entities - [slim-ner](https://www.huggingface.co/llmware/slim-ner) & [slim-ner-tool](https://www.huggingface.co/llmware/slim-ner-tool) - sentiment - evaluate sentiment - [slim-sentiment](https://www.huggingface.co/llmware/slim-sentiment) & [slim-sentiment-tool](https://www.huggingface.co/llmware/slim-sentiment-tool) - topics - generate topic - [slim-topics](https://www.huggingface.co/llmware/slim-topics) & [slim-topics-tool](https://www.huggingface.co/llmware/slim-topics-tool) - sa-ner - combo model (sentiment + named entities) - [slim-sa-ner](https://www.huggingface.co/llmware/slim-sa-ner) & [slim-sa-ner-tool](https://www.huggingface.co/llmware/slim-sa-ner-tool) - boolean - provides a yes/no output with explanation - [slim-boolean](https://www.huggingface.co/llmware/slim-boolean) & [slim-boolean-tool](https://www.huggingface.co/llmware/slim-boolean-tool) - ratings - apply 1 (low) - 5 (high) rating - [slim-ratings](https://www.huggingface.co/llmware/slim-ratings) & [slim-ratings-tool](https://www.huggingface.co/llmware/slim-ratings-tool) - emotions - assess emotions - [slim-emotions](https://www.huggingface.co/llmware/slim-emotions) & [slim-emotions-tool](https://www.huggingface.co/llmware/slim-emotions-tool) - tags - auto-generate list of tags - [slim-tags](https://www.huggingface.co/llmware/slim-tags) & [slim-tags-tool](https://www.huggingface.co/llmware/slim-tags-tool) - tags-3b - enhanced auto-generation tagging model - [slim-tags-3b](https://www.huggingface.co/llmware/slim-tags-3b) & [slim-tags-3b-tool](https://www.huggingface.co/llmware/slim-tags-3b-tool) - intent - identify intent - [slim-intent](https://www.huggingface.co/llmware/slim-intent) & [slim-intent-tool](https://www.huggingface.co/llmware/slim-intent-tool) - category - high-level category - [slim-category](https://www.huggingface.co/llmware/slim-category) & [slim-category-tool](https://www.huggingface.co/llmware/slim-category-tool) - nli - assess if evidence supports conclusion - [slim-nli](https://www.huggingface.co/llmware/slim-nli) & [slim-nli-tool](https://www.huggingface.co/llmware/slim-nli-tool) - sql - convert text into sql - [slim-sql](https://www.huggingface.co/llmware/slim-sql) & [slim-sql-tool](https://www.huggingface.co/llmware/slim-sql-tool)
You may also want to check out these quantized 'answer' tools, which work well in conjunction with SLIMs for question-answer and summarization: - bling-stablelm-3b-tool - 3b quantized RAG model - [bling-stablelm-3b-gguf](https://www.huggingface.co/llmware/bling-stablelm-3b-gguf) - bling-answer-tool - 1b quantized RAG model - [bling-answer-tool](https://www.huggingface.co/llmware/bling-answer-tool) - dragon-yi-answer-tool - 6b quantized RAG model - [dragon-yi-answer-tool](https://www.huggingface.co/llmware/dragon-yi-answer-tool) - dragon-mistral-answer-tool - 7b quantized RAG model - [dragon-mistral-answer-tool](https://www.huggingface.co/llmware/dragon-mistral-answer-tool) - dragon-llama-answer-tool - 7b quantized RAG model - [dragon-llama-answer-tool](https://www.huggingface.co/llmware/dragon-llama-answer-tool) All SLIM models have a common prompting structure. Inputs: -- text passage - this is the core passage or piece of text that you would like the model to assess -- function - classify, extract, generate - this is handled by default by the model class, so usually does not need to be explicitly declared - but is an option for SLIMs that support more than one function -- params - depends upon the model, used to configure/guide the behavior of the function call - optional for some SLIMs Outputs: -- structured Python output, generally either a dictionary or list Main objectives: -- enable function calling with small, locally-running models, -- simplify prompts by defining specific functions and fine-tuning the model to respond accordingly without 'prompt magic' -- standardize outputs so that they can be handled programmatically as part of a multi-step workflow. ```python from llmware.models import ModelCatalog def discover_slim_models(): """ Discover a list of SLIM tools in the Model Catalog. -- SLIMs are available in both traditional Pytorch and quantized GGUF packages. -- Generally, we train/fine-tune in Pytorch and then package in 4-bit quantized GGUF for inference. -- By default, we designate the GGUF versions with 'tool' or 'gguf' in their names. -- GGUF versions are generally faster to load, faster for inference and use less memory in most environments.""" tools = ModelCatalog().list_llm_tools() tool_map = ModelCatalog().get_llm_fx_mapping() print("\nList of SLIM model tools (GGUF) in the ModelCatalog\n") for i, tool in enumerate(tools): model_card = ModelCatalog().lookup_model_card(tool_map[tool]) print(f"{i} - tool: {tool} - " f"model_name: {model_card['model_name']} - " f"model_family: {model_card['model_family']}") return 0 def hello_world_slim(): """ SLIM models can be identified in the ModelCatalog like any llmware model. Instead of the standard inference method, SLIM models are used with the function_call method that prepares a special prompt instruction, and takes optional parameters. This example shows a series of function calls with different SLIM models. Please note that the first time the models are used, they will be pulled from the llmware Hugging Face repository, which will take a couple of minutes. Future calls will be much faster once the models are cached locally. """ print("\nExecuting Function Call Inferences with SLIMs\n") # Sentiment Analysis passage1 = ("This is one of the best quarters we can remember for the industrial sector " "with significant growth across the board in new order volume, as well as price " "increases in excess of inflation. We continue to see very strong demand, especially " "in Asia and Europe. 
Accordingly, we remain bullish on the tier 1 suppliers and would " "be accumulating more stock on any dips.") # here are the two key lines of code model = ModelCatalog().load_model("slim-sentiment-tool") response = model.function_call(passage1) print("sentiment response: ", response['llm_response']) # Named Entity Recognition passage2 = "Michael Johnson was a famous Olympic sprinter from the U.S. in the early 2000s." model = ModelCatalog().load_model("slim-ner-tool") response = model.function_call(passage2) print("ner response: ", response['llm_response']) # Extract anything with Slim-extract passage3 = ("Adobe shares tumbled as much as 11% in extended trading Thursday after the design software maker " "issued strong fiscal first-quarter results but came up slightly short on quarterly revenue guidance. " "Here’s how the company did, compared with estimates from analysts polled by LSEG, formerly known as Refinitiv: " "Earnings per share: $4.48 adjusted vs. $4.38 expected Revenue: $5.18 billion vs. $5.14 billion expected " "Adobe’s revenue grew 11% year over year in the quarter, which ended March 1, according to a statement. " "Net income decreased to $620 million, or $1.36 per share, from $1.25 billion, or $2.71 per share, " "in the same quarter a year ago. During the quarter, Adobe abandoned its $20 billion acquisition of " "design software startup Figma after U.K. regulators found competitive concerns. The company paid " "Figma a $1 billion termination fee.") model = ModelCatalog().load_model("slim-extract-tool") response = model.function_call(passage3, function="extract", params=["revenue growth"]) print("extract response: ", response['llm_response']) # Generate questions with Slim-Q-Gen model = ModelCatalog().load_model("slim-q-gen-tiny-tool", temperature=0.2, sample=True) # supported params - "question", "multiple choice", "boolean" response = model.function_call(passage3, params=['multiple choice']) print("question generation response: ", response['llm_response']) # Generate topic model = ModelCatalog().load_model("slim-topics-tool") response = model.function_call(passage3) print("topics response: ", response['llm_response']) # Generate headline summary with slim-xsum model = ModelCatalog().load_model("slim-xsum-tool", temperature=0.0, sample=False) response = model.function_call(passage3) print("xsum response: ", response['llm_response']) # Generate boolean with optional "(explain)" in parameter model = ModelCatalog().load_model("slim-boolean-tool") response = model.function_call(passage3, params=["Did Adobe revenue increase? (explain)"]) print("boolean response: ", response['llm_response']) # Generate tags model = ModelCatalog().load_model("slim-tags-tool", temperature=0.0, sample=False) response = model.function_call(passage3) print("tags response: ", response['llm_response']) return 0 def using_logits_and_integrating_into_process(): """ This example shows two key elements of function calling SLIM models - 1. Using Logit Information to indicate confidence levels, especially for classifications. 2. Using the structured dictionary generated for programmatic handling in a larger process. """ print("\nExample: using logits and integrating into process\n") text_passage = ("On balance, this was an average result, with earnings in line with expectations and " "no big surprises to either the positive or the negative.") # two key lines (load_model + execute function_call) + additional logit_analysis step sentiment_model = ModelCatalog().load_model("slim-sentiment-tool", get_logits=True) response = sentiment_model.function_call(text_passage) analysis = ModelCatalog().logit_analysis(response, sentiment_model.model_card, sentiment_model.hf_tokenizer_name) print("sentiment response: ", response['llm_response']) print("\nAnalyzing response") for keys, values in analysis.items(): print(f"{keys} - {values}") # two key attributes of the sentiment output dictionary sentiment_value = response["llm_response"]["sentiment"] confidence_level = analysis["confidence_score"] # use the sentiment classification as an 'if...then' decision point in a process if "positive" in sentiment_value: print("sentiment is positive .... will take 'positive' analysis path ...", sentiment_value) else: print("sentiment is negative .... will take 'negative' analysis path ...", sentiment_value) if "positive" in sentiment_value and confidence_level > 0.8: print("sentiment is positive with high confidence ... ", sentiment_value, confidence_level) return 0 if __name__ == "__main__": # discovering slim models in the llmware catalog discover_slim_models() # running function call inferences hello_world_slim() # doing interesting stuff with the output using_logits_and_integrating_into_process() ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  {% endfor %}
--- --- --- --- layout: default title: Vector Databases parent: Components nav_order: 11 description: overview of the major modules and classes of LLMWare permalink: /components/vector_databases --- # Vector Databases --- llmware supports the following vector databases: - Milvus and Milvus-Lite - `milvus` - Postgres (PG Vector) - `postgres` - Qdrant - `qdrant` - ChromaDB - `chromadb` - Redis - `redis` - Neo4j - `neo4j` - LanceDB - `lancedb` - FAISS - `faiss` - Mongo-Atlas - `mongo-atlas` - Pinecone - `pinecone` In llmware, unstructured content is ingested and organized into a Library, and then embeddings are created against the Library object, usually handled implicitly through the Library method `.install_new_embedding`. All embedding models are implemented through the embeddings.py module, and the `EmbeddingHandler` class, which routes the embedding process to the vector db specific handler and provides a common set of utility functions. In most cases, it is not necessary to call the vector db class explicitly. The design is intended to promote code re-use and to make it easy to experiment with different endpoint vector databases without significant code changes, as well as to leverage the Library as the core organizing construct. # Select Vector DB Selecting a vector database in llmware is generally done in one of two ways: 1. Explicit Setting - `LLMWareConfig().set_vector_db("postgres")` 2. Pass the name of the vector database at the time of installing the embeddings: `library.install_new_embedding(embedding_model_name=embedding_model, vector_db='milvus', batch_size=100)` # Install Vector DB No-install options: chromadb, lancedb, faiss, and milvus-lite API-based options: mongo-atlas, pinecone Install server options: Generally, we have found that Docker (and Docker-Compose) are the easiest and most consistent ways to install a vector db server across different platforms. 1. milvus - we provide a docker-compose script in the main repository root folder path, which installs mongodb as well. ```bash curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose_mongo_milvus.yaml docker compose up -d ``` 2. qdrant ```bash curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose-qdrant.yaml docker compose up -d ``` 3. postgres and pgvector ```bash curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose-pgvector.yaml docker compose up -d ``` 4. redis ```bash curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose-redis-stack.yaml docker compose up -d ``` 5. neo4j ```bash curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose-neo4j.yaml docker compose up -d ```
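Once a server is up (or one of the no-install options is selected), connecting it to a library only takes a couple of lines. The following is a minimal sketch, not a full example - it assumes a library named `my_library` has already been created and parsed (retrieved here with `load_library`), and simply combines the two selection approaches described above:

```python
from llmware.configs import LLMWareConfig
from llmware.library import Library

# Option 1 - set the vector db once as a global default (e.g., the no-install chromadb),
# then install the embedding without naming the vector db explicitly
LLMWareConfig().set_vector_db("chromadb")

library = Library().load_library("my_library")   # assumes this library was created and parsed earlier
library.install_new_embedding(embedding_model_name="mini-lm-sbert", batch_size=100)

# Option 2 - name the vector db directly at embedding time (overrides the default)
library.install_new_embedding(embedding_model_name="mini-lm-sbert",
                              vector_db="milvus", batch_size=100)
```

In either case, the `EmbeddingHandler` takes care of routing the embedding process to the selected vector database.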
# Configure Vector DB To configure a vector database in llmware, we provide configuration objects in the `configs.py` module to adjust authentication, port/host information, and other common configurations. To use a configuration object, the pattern follows simple `get_config` and `set_config` methods: ```python from llmware.configs import MilvusConfig MilvusConfig().set_config("lite", True) from llmware.configs import ChromaDBConfig current_config = ChromaDBConfig().get_config("persistent_path") ChromaDBConfig().set_config("persistent_path", "/new/local/path") ``` Configuration objects are provided for the following vector databases: `MilvusConfig`, `ChromaDBConfig`, `QdrantConfig`, `Neo4jConfig`, `LanceDBConfig`, `PineConeConfig`, `MongoConfig`, `PostgresConfig`. For 'out-of-the-box' testing and development, in most use cases you will not need to change these configs. Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  {% endfor %}
--- --- --- --- layout: default title: Whisper CPP parent: Components nav_order: 14 description: overview of the major modules and classes of LLMWare permalink: /components/whisper_cpp --- # Whisper CPP --- llmware has an integrated WhisperCPP backend which enables fast, easy local voice-to-text processing. Whisper is a leading open source voice-to-text model from OpenAI - https://github.com/openai/whisper WhisperCPP is the implementation of Whisper packaged as a GGML deliverable - https://github.com/ggerganov/whisper.cpp Starting with llmware 0.2.11, we have integrated WhisperCPPModel as a new model class, providing options for direct inference, and coming soon, integration into the Parser for easy text chunking and parsing into a Library with other document types. llmware provides prebuilt shared libraries for WhisperCPP on the following platforms: --Mac M series --Linux x86 (no CUDA) --Linux x86 (with CUDA) - really fast --Windows x86 (CPU only) currently. We have added three Whisper models to the default model catalog: 1. ggml-base.en.bin - english-only base model 2. ggml-base.bin - multi-lingual base model 3. ggml-small.en-tdrz.bin - this is a 'tiny-diarize' implementation that has been fine-tuned to identify the speakers and insert special [_SOLM_] tags to indicate a conversation turn / change of speaker. Main repo: https://github.com/akashmjn/tinydiarize/ Citation: @software{mahajan2023tinydiarize, author = {Mahajan, Akash}, month = {08}, title = {tinydiarize: Minimal extension of Whisper for speaker segmentation with special tokens}, url = {https://github.com/akashmjn/tinydiarize}, year = {2023} } To use WAV files, there is one additional Python dependency required: --pip install librosa --Note: this has been added to the default requirements.txt and PyPI build starting with 0.2.11 To use other popular audio/video file formats, such as MP3, MP4, M4A, etc., the following dependencies are required: --pip install pydub --ffmpeg library - which can be installed as follows: -- Linux: `sudo apt install ffmpeg` -- Mac: `brew install ffmpeg` -- Windows: direct download and install from the official ffmpeg site ```python """ This example shows how to use llmware provided sample files for testing with WhisperCPP, integrated as of llmware 0.2.11. # examples - "famous_quotes" | "greatest_speeches" | "youtube_demos" | "earnings_calls" -- famous_quotes - approximately 20 small .wav files with clips from old movies and speeches -- greatest_speeches - approximately 60 famous historical speeches in english -- youtube_demos - wav files of ~3 llmware youtube videos -- earnings_calls - wav files of ~4 public company earnings calls (gathered from public investor relations) These sample files are hosted in a non-restricted AWS S3 bucket, and downloaded via the Setup method `load_voice_sample_files`. There are two options: -- small_only = True: only pulls the 'famous_quotes' samples -- small_only = False: pulls all of the samples (requires ~1.9 GB in total) Please note that all of these samples have been pulled from open public domain sources, including the Internet Archives, e.g., https://archive.org. These sample files are being provided solely for the purpose of testing the code scripts below. Please do not use them for any other purpose. 
To run these examples, please make sure to `pip install librosa` """ import os from llmware.models import ModelCatalog from llmware.gguf_configs import GGUFConfigs from llmware.setup import Setup # optional / to adjust various parameters of the model GGUFConfigs().set_config("whisper_cpp_verbose", "OFF") GGUFConfigs().set_config("whisper_cpp_realtime_display", True) # note: english is default output - change to 'es' | 'fr' | 'de' | 'it' ... GGUFConfigs().set_config("whisper_language", "en") GGUFConfigs().set_config("whisper_remove_segment_markers", True) def sample_files(example="famous_quotes", small_only=False): """ Execute a basic inference on a Voice-to-Text model, passing a file_path string """ voice_samples = Setup().load_voice_sample_files(small_only=small_only) examples = ["famous_quotes", "greatest_speeches", "youtube_demos", "earnings_calls"] if example not in examples: print("choose one of the following - ", examples) return 0 fp = os.path.join(voice_samples, example) files = os.listdir(fp) # these are the two key lines whisper_base_english = "whisper-cpp-base-english" model = ModelCatalog().load_model(whisper_base_english) for f in files: if f.endswith(".wav"): prompt = os.path.join(fp, f) print(f"\n\nPROCESSING: prompt = {prompt}") response = model.inference(prompt) print("\nllm response: ", response["llm_response"]) print("usage: ", response["usage"]) return 0 if __name__ == "__main__": # pick among the four examples: famous_quotes | greatest_speeches | youtube_demos | earnings_calls sample_files(example="famous_quotes", small_only=False) ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  {% endfor %}
--- --- --- --- layout: default title: Code contributions parent: Contributing nav_order: 1 permalink: /contributing/code --- # Contributing code One way to contribute to ``llmware`` is by contributing to the code base. We briefly describe some of the important modules of ``llmware`` next, so you can more easily navigate the code base. You may also take a look at our [fast start series from YouTube](https://www.youtube.com/playlist?list=PL1-dn33KwsmD7SB9iSO6vx4ZLRAWea1DB). ## Core modules ### Library The *library* module implements the classes **Library** and **LibraryCatalog**. The **Library** class implements the *library* concept. A *library* is a collection of documents, where a document can be a PDF, an image, or an office document. It is responsible for parsing, text chunking, and indexing. In other words, it does the heavy lifting of adding content. In the following, we briefly describe the functions for adding documents to the library. ```python add_file( self, file_path): ``` This method adds one document of any supported type to the library. ```python add_files( self, input_folder_path=None, encoding="utf-8", chunk_size=400, get_images=True,get_tables=True, smart_chunking=2, max_chunk_size=600, table_grid=True, get_header_text=True, table_strategy=1, strip_header=False, verbose_level=2, copy_files_to_library=True): ``` This method adds the documents of one folder to the library. ```python add_website( self, url, get_links=True, max_links=5): ``` This method adds a website, and links from the website, to the library. ```python add_wiki( self, topic_list, target_results=10): ``` This method adds Wikipedia articles to the library. ```python add_dialogs( self, input_folder=None): ``` This method adds an AWS dialog transcript to the library. ```python add_image( self, input_folder=None): ``` This method adds images to the library. ```python add_pdf_by_ocr( self, input_folder=None): ``` This method adds scanned PDFs to the library. ```python add_pdf( self, input_folder=None): ``` This method adds PDFs to the library. ```python add_office( self, input_folder=None): ``` This method adds office documents to the library. ### Embeddings An *embedding* combines a vector store with an embedding model. It is responsible for applying an embedding model to a library, storing the embeddings in a vector store, and providing access to the embeddings with natural language queries. We briefly describe the common methods offered by all vector stores below. ```python def create_new_embedding( self, doc_ids=None, batch_size=500): ``` This method creates the embeddings and adds them to the vector store. ```python def search_index( self, query_embedding_vector, sample_count=10): ``` This method searches the vector store given the query vector. ```python def delete_index(self): ``` This method deletes the created vector store index. ### Prompts A *prompt* is an input to a model. The prompt is used by the model to generate the response. One important use case is that users want to augment a prompt, or a series of prompts, with additional information. Next, we describe the methods for augmenting a prompt in this way. ```python def add_source_new_query( self, library, query=None, query_type="semantic", result_count=10): ``` This method adds the results of the ``query`` to the prompt. ```python def add_source_query_results( self, query_results): ``` This method adds previous results from a query as a source to the prompt. 
```python def add_source_library( self, library_name): ``` This method adds an entire library to the prompt. We recommend that you only use this when the library is sufficiently small. ```python def add_source_wikipedia( self, topic, article_count=3, query=None): ``` This method adds wikipedia articles to the prompt based on the provided ``topic``. ```python def add_source_yahoo_finance( self, ticker=None, key_list=None): ``` This method adds a Yahoo finance ticker to the prompt. ```python def add_source_knowledge_graph( self, library, query): ``` This method adds the summary output elements from a knowledge graph based on the provided ``query``. Please note that this method is experimental, i.e. unstable, and is subject to change dramatically in each new version. ```python def add_source_website( self, url, query=None): ``` This method adds the website pointed to by the ``url`` to the prompt. ```python def add_source_document( self, input_fp, input_fn, query=None): ``` This method adds a document, or documents, of any supported type to the prompt. If documents are added, then the ``query`` parameter can be used to filter the documents. ```python def add_source_last_interaction_step( self): ``` This method adds the last interaction to the prompt. The use case for this is to enable interactive dialog, i.e. chatting. ### Model Catalog A *model catalog* is a collection of models. In the following, we briefly describe the methods for adding new models to the catalog. ```python def register_new_hf_generative_model( self, hf_model_name=None, context_window=2048, prompt_wrapper="", display_name=None, temperature=0.3, trailing_space="", link=""): ``` This method adds a new generative model from hugging face. Users can therefore add models from hugging face that are unsupported currently. ```python def register_sentence_transformer_model( self, model_name, embedding_dims, context_window, display_name=None, link=""): ``` This method adds a new sentence transformer. ```python def register_gguf_model( self, model_name, gguf_model_repo, gguf_model_file_name, prompt_wrapper=None, eos_token_id=0, display_name=None, trailing_space="", temperature=0.3, context_window=2048, instruction_following=True): ``` This method adds a new GGUF model. ```python def register_open_chat_model( cls, model_name, api_base=None, model_type="chat", display_name=None, context_window=4096, instruction_following=True, prompt_wrapper="", temperature=0.5): ``` This method adds any chat model that is available through a web API, e.g. a chat model that is available locally via localhost. ```python def register_ollama_model( cls, model_name, host="localhost", port=11434, model_type="chat", raw=False, stream=False, display_name=None, context_window=4096, instruction_following=True, prompt_wrapper="", temperature=0.5): ``` This method adds an OLLama model that is available through a web API. The method is similar to the ``register_open_chat_model`` method above. ### Categories of code contributions #### New or Enhancement to existing Features You want to submit a code contribution that adds a new feature or enhances an existing one? Then the best way to start is by opening a discussion in our [GitHub discussions](https://github.com/llmware-ai/llmware/discussions). Please do this before you work on it, so you do not put effort into it just to realise after submission that it will not be merged. #### Bugs If you encounter a bug, you can - File an issue about the bug. 
- Provide a self-contained minimal example that reproduces the bug, which is extremely important. - Provide possible solutions for the bug. - Submit a pull request to fix the bug. We encourage you to read [How to create a Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example) from the Stack Overflow help center, and the tag description of [self-contained](https://stackoverflow.com/tags/self-contained/info), also from Stack Overflow. --- --- layout: default title: Contributing nav_order: 7 has_children: true description: llmware contributions. permalink: /contributing --- # Contributing to llmware {: .note} > The contributions to `llmware` are governed by our [Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md). {: .warning} > Have you found a security issue? Then please jump to [Security Vulnerabilities](#security-vulnerabilities). On this page, we provide information about ``llmware`` contributions. There are **two ways** you can contribute. The first is by making **code contributions**, and the second by making contributions to the **documentation**. Please look at our [contribution suggestions](#how-can-you-contribute) if you need inspiration, or take a look at [open issues](#open-issues). Contributions to `llmware` are welcome from everyone. Our goal is to make the process simple, transparent, and straightforward. We are happy to receive suggestions on how the process can be improved. ## How can you contribute? {: .note} > If you have never contributed before, look for issues with the tag [``good first issue``](https://github.com/llmware-ai/llmware/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22). The most common ways to contribute are to add new features, fix bugs, add tests, or add documentation. You can visit the [issues](https://github.com/llmware-ai/llmware/issues) site of the project and search for tags such as ``bug``, ``enhancement``, ``documentation``, or ``test``. Here is a non-exhaustive list of contributions you can make. 1. Code refactoring 2. Add new text databases 3. Add new vector databases 4. Fix bugs 5. Add usage examples (see for example the issues [jupyter notebook - more examples and better support](https://github.com/llmware-ai/llmware/issues/508) and [google colab examples and start up scripts](https://github.com/llmware-ai/llmware/issues/507)) 6. Add experimental features 7. Improve code quality 8. Improve documentation in the docs (what you are reading right now) 9. Improve documentation by adding or updating docstrings in modules, classes, methods, or functions (see for example [Add docstrings](https://github.com/llmware-ai/llmware/issues/219)) 10. Improve test coverage 11. Answer questions in our [Discord channel](https://discord.gg/MhZn5Nc39h), especially in the [technical support forum](https://discord.com/channels/1179245642770559067/1218498778915672194) 12. Post projects in which you use ``llmware`` in our Discord forum [made with llmware](https://discord.com/channels/1179245642770559067/1218567269471486012), ideally with a link to a public GitHub repository ## Open Issues If you're interested in existing issues, you can - Look for issues; if you are new to the project, look for issues with the `good first issue` label. - Provide answers for questions in our [GitHub discussions](https://github.com/llmware-ai/llmware/discussions) - Provide help for bug or enhancement issues. - Ask questions, reproduce the issues, or provide solutions. - Submit a pull request to fix the issue. 
## Security Vulnerabilities **If you believe you've found a security vulnerability, then please _do not_ submit an issue ticket or pull request or otherwise publicly disclose the issue.** Please follow the process at [Reporting a Vulnerability](https://github.com/llmware-ai/llmware/blob/main/Security.md) ## GitHub workflow We follow the [``fork-and-pull``](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork) Git workflow. 1. [Fork](https://docs.github.com/en/github/getting-started-with-github/fork-a-repo) the repository on GitHub. 2. Clone your fork to your local machine with `git clone git@github.com:<your-username>/llmware.git`. 3. Create a branch with `git checkout -b my-topic-branch`. 4. Run the test suite by navigating to the tests/ folder and running ```./run-tests.py -s``` to ensure there are no failures. 5. [Commit](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/committing-changes-to-a-pull-request-branch-created-from-a-fork) changes to your own branch, then push to GitHub with `git push origin my-topic-branch`. 6. Submit a [pull request](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests) so that we can review your changes. Remember to [synchronize your forked repository](https://docs.github.com/en/github/getting-started-with-github/fork-a-repo#keep-your-fork-synced) _before_ submitting proposed changes upstream. If you have an existing local repository, please update it before you start, to minimize the chance of merge conflicts. ```shell git remote add upstream git@github.com:llmware-ai/llmware.git git fetch upstream git checkout upstream/main -b my-topic-branch ``` ## Community Questions and discussions are welcome in any shape or form. Please feel free to join our community on our Discord channel, on which we are active daily. You are also welcome if you just want to post an idea! - [Discord Channel](https://discord.gg/MhZn5Nc39h) - [GitHub discussions](https://github.com/llmware-ai/llmware/discussions) --- --- layout: default title: Documentation contributions parent: Contributing nav_order: 2 permalink: /contributing/documentation --- # Contributing documentation One way to contribute to ``llmware`` is by contributing documentation. There are **two ways** to contribute to the ``llmware`` documentation. The first is via **docstrings in the code**, and the second is **the docs**, which is what you are *currently reading*. In both areas, you can contribute in a lot of ways. Here is a non-exhaustive list of these ways for the docstrings, which also apply to the docs. 1. Add documentation (e.g., adding a docstring to a function) 2. Update documentation (e.g., update a docstring that is not in sync with the code) 3. Simplify documentation (e.g., formulate a docstring more clearly) 4. Enhance documentation (e.g., add more examples to a docstring or fix typos) ## Docstrings **Docstrings** document the code within the code, which allows programmers to easily have a look while they are programming. For an example, have a look at [this docstring](https://github.com/llmware-ai/llmware/blob/c9e12a7a150162986622738e127c37ac70f31cd6/llmware/agents.py#L27-L66) which documents the ``LLMfx`` class. We follow the docstring style of **numpy**, for which you can find an example [here](https://github.com/numpy/numpydoc/blob/main/doc/example.py) and [here](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html). 
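To make the expected layout concrete, here is a short sketch of a numpy-style docstring on a small helper function - the function itself is invented purely for illustration and is not part of the ``llmware`` code base:

```python
def chunk_text(text, chunk_size=400):
    """Split a text passage into chunks of roughly equal size.

    Parameters
    ----------
    text : str
        The input passage to be split.
    chunk_size : int, optional
        Target number of characters per chunk (default is 400).

    Returns
    -------
    list of str
        The list of text chunks, in original order.

    Examples
    --------
    >>> chunk_text("a short example", chunk_size=5)
    ['a sho', 'rt ex', 'ample']
    """
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```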
Please be sure to follow the conventions and go over your pull request before you submit it. ## Docs {: .note} > All commands are executed from the `docs` sub-directory. Contributing to this documentation is extremely important as many users will refer to it. If you plan to contribute to the docs, we recommend that you install `jekyll` locally so you can test your changes before submitting them. We also recommend that you install `jekyll` into an isolated ruby environment so it does not interfere with any other installations you might have. We recommend that you install `rbenv` and `rvm` to manage your ruby installation. `rbenv` is a tool that manages different ruby versions, similar to what `conda` does for `python`. Please [install rbenv](https://github.com/rbenv/rbenv?tab=readme-ov-file#installation) following their instructions, and do the same to [install rvm](https://github.com/rvm/rvm?tab=readme-ov-file#installing-rvm). We recommend that you install a ruby version `>=3.0`. After having installed an isolated ruby version, you have to install the dependencies to build the docs locally. The `docs` directory has a `Gemfile` which specifies the dependencies. You can hence simply navigate to it and use the `bundle install` command. ```bash bundle install ``` You should now be able to build and serve the documentation locally. To do this, simply do the following. ```bash bundle exec jekyll server --livereload --verbose ``` In the browser of your choice, you can then go to `http://127.0.0.1:4000/` and you will be served the documentation, which is rebuilt and reloaded after any change to the `docs`. ``jekyll`` will create a ``_site`` directory where it saves the created files, please **never commit any files from the \_site directory**! ## Open Issues If you're interested in existing issues, you can - Look for issues with the `good first issue` and `documentation` label as a good place to get started. - Provide answers for questions in our [GitHub discussions](https://github.com/llmware-ai/llmware/discussions) - Provide help for bug or enhancement issues. - Ask questions, reproduce the issues, or provide solutions. - Submit a pull request to fix the issue. --- --- layout: default title: Agents parent: Examples nav_order: 2 description: overview of the major modules and classes of LLMWare permalink: /examples/agents --- # Agents 🚀 Start Building Multi-Model Agents Locally on a Laptop 🚀 =============== **What is a SLIM?** **SLIMs** are **S**tructured **L**anguage **I**nstruction **M**odels, which are small, specialized 1-3B parameter LLMs, fine-tuned to generate structured outputs (Python dictionaries and lists, JSON and SQL) that can be handled programmatically, and stacked together in multi-step, multi-model Agent workflows - all running on a local CPU. **New SLIMs just released** - check out slim-extract, slim-summary, slim-xsum, slim-sa-ner, slim-boolean and slim-tags-3b **Check out the new examples below marked with ⭐** 🔥🔥🔥 Web Services & Function Calls ([code](web_services_slim_fx.py)) 🔥🔥🔥 **Check out the Intro videos** [SLIM Intro Video](https://www.youtube.com/watch?v=cQfdaTcmBpY) There are 16 SLIM models, each delivered in two packages - a Pytorch/Huggingface FP16 model, and a quantized "tool" designed for fast inference on a CPU, using LLMWare's embedded GGUF inference engine. In most cases, we would recommend that you start with the "tools" version of the models. 
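Before diving into the examples, the basic usage pattern for any of the 'tool' versions is just two calls - load the model from the ModelCatalog and invoke `function_call` on a text passage. A minimal sketch (the passage below is invented for illustration):

```python
from llmware.models import ModelCatalog

# load the quantized 'tool' version of the sentiment SLIM from the catalog
model = ModelCatalog().load_model("slim-sentiment-tool")

# run a function call on a short passage - the output is a structured Python dictionary
response = model.function_call("The quarter was disappointing, with revenue well below expectations.")
print(response["llm_response"])
```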
**Getting Started** We have several ready-to-run examples in this repository:

| Example | Detail |
|---------|--------|
| 1. Getting Started with SLIM Models ([code](slims-getting-started.py) / [video](https://www.youtube.com/watch?v=aWZFrTDmMPc&t=196s)) | Install the models and run hello world tests to see the models in action. |
| 2. Getting Started with Function-Calling Agent ([code](agent-llmfx-getting-started.py) / [video](https://www.youtube.com/watch?v=cQfdaTcmBpY)) | Generate a Structured Report with LLMfx |
| 3. Multi-step Complex Analysis with Agent ([code](agent-multistep-analysis.py) / [video](https://www.youtube.com/watch?v=y4WvwHqRR60)) | Delivering Complex Research Analysis with SLIM Agents |
| 4. Document Clustering ([code](document-clustering.py)) | Multi-faceted automated document analysis with Topics, Tags and NER |
| 5. Two-Step NER Retrieval ([code](ner-retrieval.py)) | Using NER to extract a name, and then using it as the basis for retrieval. |
| 6. Using Sentiment Analysis ([code](sentiment-analysis.py)) | Using sentiment analysis on earnings transcripts and an 'if...then' condition |
| 7. Text2SQL - Intro ([code](text2sql-getting-started-1.py)) | Getting Started with SLIM-SQL-TOOL and Basic Text2SQL Inference |
| 8. Text2SQL - E2E ([code](text2sql-end-to-end-2.py)) | End-to-End Natural Language Query to SQL DB Query |
| 9. Text2SQL - MultiStep ([code](text2sql-multistep-example-3.py)) | Extract a customer name using NER and use in a Text2SQL query |
| 10. ⭐ Web Services & Function Calls ([code](web_services_slim_fx.py)) | Generate 30 key financial analyses with SLIM function calls and web services |
| 11. ⭐ Yes-No Questions with Explanations ([code](using_slim_boolean_model.py)) | Analyze earnings releases with SLIM Boolean |
| 12. ⭐ Extracting Revenue Growth ([code](using_slim_extract_model.py)) | Extract revenue growth from earnings releases with SLIM Extract |
| 13. ⭐ Summary as a Function Call ([code](using_slim_summary.py)) | Simple Summarization as a Function Call with List Length Parameters |
| 14. ⭐ Handling Not Found Extracts ([code](not_found_extract_with_lookup.py)) | Multi-step Lookup strategy and handling not-found answers |
| 15. ⭐ Extract + Lookup ([code](custom_extract_and_lookup.py)) | Extract Named Entity information and use for lookups with SLIM Extract |
| 16. ⭐ Headline/Title as XSUM Function Call ([code](using_slim_xsum.py)) | eXtreme Summarization (XSUM) with SLIM XSUM |

For information on all of the SLIM models, check out [LLMWare SLIM Model Collection](https://www.huggingface.co/llmware/). 
**Models List** If you would like more information about any of the SLIM models, please check out their model card: - extract - extract custom keys - [slim-extract](https://www.huggingface.co/llmware/slim-extract) & [slim-extract-tool](https://www.huggingface.co/llmware/slim-extract-tool) - summary - summarize function call - [slim-summary](https://www.huggingface.co/llmware/slim-summary) & [slim-summary-tool](https://www.huggingface.co/llmware/slim-summary-tool) - xsum - title/headline function call - [slim-xsum](https://www.huggingface.co/llmware/slim-xsum) & [slim-xsum-tool](https://www.huggingface.co/llmware/slim-xsum-tool) - ner - extract named entities - [slim-ner](https://www.huggingface.co/llmware/slim-ner) & [slim-ner-tool](https://www.huggingface.co/llmware/slim-ner-tool) - sentiment - evaluate sentiment - [slim-sentiment](https://www.huggingface.co/llmware/slim-sentiment) & [slim-sentiment-tool](https://www.huggingface.co/llmware/slim-sentiment-tool) - topics - generate topic - [slim-topics](https://www.huggingface.co/llmware/slim-topics) & [slim-topics-tool](https://www.huggingface.co/llmware/slim-topics-tool) - sa-ner - combo model (sentiment + named entities) - [slim-sa-ner](https://www.huggingface.co/llmware/slim-sa-ner) & [slim-sa-ner-tool](https://www.huggingface.co/llmware/slim-sa-ner-tool) - boolean - provides a yes/no output with explanation - [slim-boolean](https://www.huggingface.co/llmware/slim-boolean) & [slim-boolean-tool](https://www.huggingface.co/llmware/slim-boolean-tool) - ratings - apply 1 (low) - 5 (high) rating - [slim-ratings](https://www.huggingface.co/llmware/slim-ratings) & [slim-ratings-tool](https://www.huggingface.co/llmware/slim-ratings-tool) - emotions - assess emotions - [slim-emotions](https://www.huggingface.co/llmware/slim-emotions) & [slim-emotions-tool](https://www.huggingface.co/llmware/slim-emotions-tool) - tags - auto-generate list of tags - [slim-tags](https://www.huggingface.co/llmware/slim-tags) & [slim-tags-tool](https://www.huggingface.co/llmware/slim-tags-tool) - tags-3b - enhanced auto-generation tagging model - [slim-tags-3b](https://www.huggingface.co/llmware/slim-tags-3b) & [slim-tags-3b-tool](https://www.huggingface.co/llmware/slim-tags-3b-tool) - intent - identify intent - [slim-intent](https://www.huggingface.co/llmware/slim-intent) & [slim-intent-tool](https://www.huggingface.co/llmware/slim-intent-tool) - category - high-level category - [slim-category](https://www.huggingface.co/llmware/slim-category) & [slim-category-tool](https://www.huggingface.co/llmware/slim-category-tool) - nli - assess if evidence supports conclusion - [slim-nli](https://www.huggingface.co/llmware/slim-nli) & [slim-nli-tool](https://www.huggingface.co/llmware/slim-nli-tool) - sql - convert text into sql - [slim-sql](https://www.huggingface.co/llmware/slim-sql) & [slim-sql-tool](https://www.huggingface.co/llmware/slim-sql-tool) You may also want to check out these quantized 'answer' tools, which work well in conjunction with SLIMs for question-answer and summarization: - bling-stablelm-3b-tool - 3b quantized RAG model - [bling-stablelm-3b-gguf](https://www.huggingface.co/llmware/bling-stablelm-3b-gguf) - bling-answer-tool - 1b quantized RAG model - [bling-answer-tool](https://www.huggingface.co/llmware/bling-answer-tool) - dragon-yi-answer-tool - 6b quantized RAG model - [dragon-yi-answer-tool](https://www.huggingface.co/llmware/dragon-yi-answer-tool) - dragon-mistral-answer-tool - 7b quantized RAG model - [dragon-mistral-answer-tool](https://www.huggingface.co/llmware/dragon-mistral-answer-tool) - dragon-llama-answer-tool - 7b quantized RAG 
model - [dragon-llama-answer-tool](https://www.huggingface.co/llmware/dragon-llama-answer-tool) **Set up** No special setup for SLIMs is required other than to install llmware >=0.2.6, e.g., `pip3 install llmware`. **Platforms:** - Mac M1, Mac x86, Windows, Linux (Ubuntu 22 preferred, supported on Ubuntu 20+) - RAM: 16 GB minimum - Python 3.9, 3.10, 3.11 (note: not supported on 3.12 yet) - llmware >= 0.2.6 version ### **Let's get started! 🚀** --- --- layout: default title: Datasets parent: Examples nav_order: 10 description: overview of the major modules and classes of LLMWare permalink: /examples/datasets --- # Datasets - Introduction by Examples llmware provides powerful capabilities to transform raw unstructured information into various model-ready datasets. ```python import os import json from llmware.library import Library from llmware.setup import Setup from llmware.dataset_tools import Datasets from llmware.retrieval import Query def build_and_use_dataset(library_name): # Setup a library and build a knowledge graph. Datasets will use the data in the knowledge graph print (f"\n > Creating library {library_name}...") library = Library().create_new_library(library_name) sample_files_path = Setup().load_sample_files() library.add_files(os.path.join(sample_files_path,"SmallLibrary")) library.generate_knowledge_graph() # Create a Datasets object from library datasets = Datasets(library) # Build a basic dataset useful for industry domain adaptation for fine-tuning embedding models print (f"\n > Building basic text dataset...") basic_embedding_dataset = datasets.build_text_ds(min_tokens=500, max_tokens=1000) dataset_location = os.path.join(library.dataset_path, basic_embedding_dataset["ds_id"]) print (f"\n > Dataset:") print (f"(Files referenced below are found in {dataset_location})") print (f"\n{json.dumps(basic_embedding_dataset, indent=2)}") sample = datasets.get_dataset_sample(datasets.current_ds_name) print (f"\nRandom sample from the dataset:\n{json.dumps(sample, indent=2)}") # Other Dataset Generation and Usage Examples: # Build a simple self-supervised generative dataset - extracts text and splits into 'text' & 'completion' # Several generative "prompt_wrappers" are available - chat_gpt | alpaca | basic_generative_completion_dataset = datasets.build_gen_ds_targeted_text_completion(prompt_wrapper="alpaca") # Build generative self-supervised training sets created by pairing 'header_text' with 'text' xsum_generative_completion_dataset = datasets.build_gen_ds_headline_text_xsum(prompt_wrapper="human_bot") topic_prompter_dataset = datasets.build_gen_ds_headline_topic_prompter(prompt_wrapper="chat_gpt") # Filter a library by a key term as part of building the dataset filtered_dataset = datasets.build_text_ds(query="agreement", filter_dict={"master_index":1}) # Pass a set of query results to create a dataset from those results only query_results = Query(library=library).query("africa") query_filtered_dataset = datasets.build_text_ds(min_tokens=250, max_tokens=600, qr=query_results) return 0 ``` For more examples, see the [datasets examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Datasets/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). 
## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  {% endfor %}
---
--- --- --- layout: default title: Embedding parent: Examples nav_order: 5 description: overview of the major modules and classes of LLMWare permalink: /examples/embedding --- # Embedding - Introduction by Examples We introduce ``llmware`` through self-contained examples. ```python """ This example is a fast start with Milvus Lite, which is a 'no-install' file-based version of Milvus, intended for rapid prototyping. A couple of key points to note: -- Platform - per Milvus docs, Milvus Lite is designed for Mac and Linux (not on Windows currently) -- PyMilvus - need to `pip install pymilvus>=2.4.2` -- within LLMWare: set MilvusConfig().set_config("lite", True) """ import os from llmware.library import Library from llmware.retrieval import Query from llmware.setup import Setup from llmware.status import Status from llmware.models import ModelCatalog from llmware.configs import LLMWareConfig, MilvusConfig from importlib import util if not util.find_spec("pymilvus"): print("\nto run this example with pymilvus, you need to install pymilvus: pip3 install pymilvus>=2.4.2") def setup_library(library_name): """ Note: this setup_library method is provided to enable a self-contained example to create a test library """ # Step 1 - Create library which is the main 'organizing construct' in llmware print ("\nupdate: Creating library: {}".format(library_name)) library = Library().create_new_library(library_name) # check the embedding status 'before' installing the embedding embedding_record = library.get_embedding_status() print("embedding record - before embedding ", embedding_record) # Step 2 - Pull down the sample files from S3 through the .load_sample_files() command # --note: if you need to refresh the sample files, set 'over_write=True' print ("update: Downloading Sample Files") sample_files_path = Setup().load_sample_files(over_write=False) # Step 3 - point ".add_files" method to the folder of documents that was just created # this method parses the documents, text chunks, and captures in database print("update: Parsing and Text Indexing Files") library.add_files(input_folder_path=os.path.join(sample_files_path, "Agreements"), chunk_size=400, max_chunk_size=600, smart_chunking=1) return library def install_vector_embeddings(library, embedding_model_name): """ This method is the core example of installing an embedding on a library. 
-- two inputs - (1) a pre-created library object and (2) the name of an embedding model """ library_name = library.library_name vector_db = LLMWareConfig().get_vector_db() print(f"\nupdate: Starting the Embedding: " f"library - {library_name} - " f"vector_db - {vector_db} - " f"model - {embedding_model_name}") # *** this is the one key line of code to create the embedding *** library.install_new_embedding(embedding_model_name=embedding_model_name, vector_db=vector_db, batch_size=100) # note: for using llmware as part of a larger application, you can check the real-time status by polling Status() # --both the EmbeddingHandler and Parsers write to Status() at intervals while processing update = Status().get_embedding_status(library_name, embedding_model_name) print("update: Embeddings Complete - Status() check at end of embedding - ", update) # Start using the new vector embeddings with Query sample_query = "incentive compensation" print("\n\nupdate: Run a sample semantic/vector query: {}".format(sample_query)) # queries are constructed by creating a Query object, and passing a library as input query_results = Query(library).semantic_query(sample_query, result_count=20) for i, entries in enumerate(query_results): # each query result is a dictionary with many useful keys text = entries["text"] document_source = entries["file_source"] page_num = entries["page_num"] vector_distance = entries["distance"] # to see all of the dictionary keys returned, uncomment the line below # print("update: query_results - all - ", i, entries) # for display purposes only, we will only show the first 125 characters of the text if len(text) > 125: text = text[0:125] + " ... " print("\nupdate: query results - {} - document - {} - page num - {} - distance - {} " .format( i, document_source, page_num, vector_distance)) print("update: text sample - ", text) # let's take a look at the library embedding status again at the end to confirm embeddings were created embedding_record = library.get_embedding_status() print("\nupdate: embedding record - ", embedding_record) return 0 if __name__ == "__main__": # Fast Start configuration - will use no-install embedded sqlite # -- if you have installed Mongo or Postgres, then change the .set_active_db accordingly LLMWareConfig().set_active_db("sqlite") # set the "lite" flag in MilvusConfig to True -> to use server version, set to False (which is default) MilvusConfig().set_config("lite", True) LLMWareConfig().set_vector_db("milvus") # Step 1 - create library library = setup_library("ex2_milvus_lite") # Step 2 - Select any embedding model in the LLMWare catalog # to see a list of the embedding models supported, uncomment the line below and print the list embedding_models = ModelCatalog().list_embedding_models() # for i, models in enumerate(embedding_models): # print("embedding models: ", i, models) # for this first embedding, we will use a very popular and fast sentence transformer embedding_model = "mini-lm-sbert" # note: if you want to swap out "mini-lm-sbert" for Open AI 'text-embedding-ada-002', uncomment these lines: # embedding_model = "text-embedding-ada-002" # os.environ["USER_MANAGED_OPENAI_API_KEY"] = "" # run the core script install_vector_embeddings(library, embedding_model) ```
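As a small variation on the example above, the same two functions can be pointed at a different no-install vector database simply by changing the configuration lines in the `__main__` block - a minimal sketch, assuming the rest of the script is unchanged:

```python
# swap the vector database without touching the embedding code itself
LLMWareConfig().set_active_db("sqlite")       # text collection database (unchanged)
LLMWareConfig().set_vector_db("chromadb")     # another no-install vector database option

library = setup_library("ex2_chromadb")
install_vector_embeddings(library, "mini-lm-sbert")
```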
For more examples, see the [embedding examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Embedding/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  {% endfor %}
---
--- --- --- layout: default title: Examples nav_order: 5 has_children: true description: examples, recipes and use cases permalink: /examples --- llmware offers a wide range of examples to cover the lifecycle of building RAG and Agent based applications using small language models: - [Parsing examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing) - ~14 stand-alone parsing examples for all common document types, including options for parsing in memory, outputting to JSON, parsing custom configured CSV and JSON files, running OCR on embedded images found in documents, table extraction, image extraction, text chunking, zip files, and web sources. - [Embedding examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Embedding) - ~15 stand-alone embedding examples to show how to use ~10 different vector databases and wide range of leading open source embedding models (including sentence transformers). - [Retrieval examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Retrieval) - ~10 stand-alone examples illustrating different query and retrieval techniques - semantic queries, text queries, document filters, page filters, 'hybrid' queries, author search, using query state, and generating bibliographies. - [Dataset examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Datasets) - ~5 stand-alone examples to show 'next steps' of how to leverage a Library to re-package content into various datasets and automated NLP analytics. - [Fast start example #1-Parsing](https://github.com/llmware-ai/llmware/blob/main/fast_start/rag/example-1-create_first_library.py) - shows the basics of parsing. - [Fast start example #2-Embedding](https://github.com/llmware-ai/llmware/blob/main/fast_start/rag/example-2-build_embeddings.py) - shows the basics of building embeddings. - [CustomTable examples](https://github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables) - ~5 examples to start building structured tables that can be used in conjunction with LLM-based workflows. - [Models examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Models) - ~20 examples showing a wide range of different model inferences and use cases, including the ability to integrate Ollama models, OpenChat (e.g., LMStudio) models, using LLama-3 and Phi-3, bringing your own models into the ModelCatalog, and configuring sampling settings. - [Prompts examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Prompts) - ~5 examples that illustrate how to use Prompt as an integrated workflow for integrating knowledge sources, managing prompt history, and applying fact-checking. - [SLIM-Agents examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents) - ~20 examples showing how to build multi-model, multi-step Agent processes using locally-running SLIM function calling models. - [Fast start example #3-Prompts and Models](https://github.com/llmware-ai/llmware/blob/main/fast_start/rag/example-3-prompts_and_models.py) - getting started with model inference. --- --- layout: default title: Introduction by Examples parent: Examples nav_order: 9 permalink: /examples/getting_started --- # Introduction by Examples We introduce ``llmware`` through self-contained examples. # Your first library and query {: .note } > The code here is a modified version from [example-1-create_first_library.py](https://github.com/llmware-ai/llmware/blob/main/fast_start/example-1-create_first_library.py). 
> The adjustments are made to ease understanding for this post.

In this introduction, we will walk through the steps of creating a **library**. To create a ``library`` in ``llmware``, we have to instantiate a ``library`` object and call the ``add_files`` method, which will parse the files, chunk up the text, and also index it. We will also download the sample files we provide, which can be used for any experimentation you might want to do.

**Configuring llmware**

Before we get started, we can influence the configuration of ``llmware``. For example, we can decide which **text collection** database to use, and set the logging level. By default, ``llmware`` uses MongoDB as the text collection database and has a ``debug_mode`` level of ``0``. This means that by default, ``llmware`` will show the status manager and print errors. The status manager is useful for large parsing jobs. In this ``library`` introduction, we will change the text collection database as well as the ``debug_mode``. As the text collection database, we will choose ``sqlite``. And we will change the ``debug_mode`` to ``2``, which will show the file name that is being parsed, i.e. file-by-file progress.

```python
from llmware.configs import LLMWareConfig

LLMWareConfig().set_active_db("sqlite")
LLMWareConfig().set_config("debug_mode", 2)
```

**Downloading sample files**

We start by downloading the sample files we need. ``llmware`` provides a set of sample files which we use throughout our examples. The following code snippet downloads these sample files, and in doing so creates the directories *Agreements*, *Invoices*, *UN-Resolutions-500*, *SmallLibrary*, *FinDocs*, and *AgreementsLarge*. If you want to get the newest version of the sample files, you can set ``over_write=True``. However, we encourage you to try it out with your own files once you are comfortable enough with ``llmware``.

```python
from llmware.setup import Setup

sample_files_path = Setup().load_sample_files(over_write=False)
```

``sample_files_path`` is the path where the files are stored. Assuming that your user name is ``foo``, on Linux the path would be ``'/home/foo/llmware_data/sample_files'``.

**Creating a library**

Now that we have data, we can start to create our library. In ``llmware``, a **library** is a collection of unstructured data. Currently, ``llmware`` supports *text* and *images*. The following code creates an empty ``library`` with the name ``my_llmware_library``.

```python
from llmware.library import Library

library = Library().create_new_library('my_llmware_library')
```

**Adding files to a library**

Now that we have created a ``library``, we are ready to *add files* to it. Currently, the ``add_files`` method supports pdf, pptx, docx, xlsx, csv, md, txt, json, wav, zip, jpg, and png files. The method will automatically choose the correct parser based on the file extension.

```python
library.add_files('/home/foo/llmware_data/sample_files/Agreements')
```

**The library card**

A ``library`` keeps an inventory of its files, similar to a good librarian. We do this with a *library card*. At the time of this writing, a library card has the keys _id, library_name, embedding, knowledge_graph, unique_doc_id, documents, blocks, images, pages, tables, and account_name.

```python
updated_library_card = library.get_library_card()

doc_count = updated_library_card["documents"]
block_count = updated_library_card["blocks"]

updated_library_card.keys()
```

You can also see where the library is stored via the ``library_main_path`` attribute.
Again, assuming your user name is *foo* and you are on a Linux system, the ``library_main_path`` is ``'/home/foo/llmware_data/accounts/llmware/my_llmware_library'``.

```python
library.library_main_path
```

**Querying a library**

Finally, we are ready to execute a query against our library. Remember that the text is indexed automatically when we add it to the library. The result of a ``Query`` is a list of dictionaries, where one dictionary is one result. A result dictionary has a wide range of useful keys. A few important keys in the dictionary are *text*, *file_source*, *page_num*, *doc_ID*, *block_ID*, and *matches*. In the following, we query the library for the base salary, return the first ten results, and iterate over the results.

```python
from llmware.retrieval import Query

query_results = Query(library).text_query('base salary', result_count=10)

for query_result in query_results:
    text = query_result["text"]
    file_source = query_result["file_source"]
    page_number = query_result["page_num"]
    doc_id = query_result["doc_ID"]
    block_id = query_result["block_ID"]
    matches = query_result["matches"]
```

You can take a look at all the keys that are returned by calling ``keys()``.

```python
query_results[0].keys()
```

--- --- layout: default title: Models parent: Examples nav_order: 3 description: overview of the major modules and classes of LLMWare permalink: /examples/models ---

# Models

We introduce ``llmware`` through self-contained examples.

```python """ This example demonstrates prompting local BLING models with provided context - easy to select among different BLING models between 1B - 4B, including both Pytorch versions and GGUF quantized versions, and to swap out the hello_world questions with your own test set. NOTE: if you are running on a CPU with limited memory (e.g., <16 GB of RAM), we would recommend sticking to the 1B parameter models, or using the quantized GGUF versions. You may get out-of-memory errors and/or very slow performance with ~3B parameter Pytorch models. Even with 16 GB+ of RAM, the 3B Pytorch models should run but will be slow (without GPU acceleration). """ import time from llmware.prompts import Prompt def hello_world_questions(): test_list = [ {"query": "What is the total amount of the invoice?", "answer": "$22,500.00", "context": "Services Vendor Inc. \n100 Elm Street Pleasantville, NY \nTO Alpha Inc. 5900 1st Street " "Los Angeles, CA \nDescription Front End Engineering Service $5000.00 \n Back End Engineering" " Service $7500.00 \n Quality Assurance Manager $10,000.00 \n Total Amount $22,500.00 \n" "Make all checks payable to Services Vendor Inc. Payment is due within 30 days." "If you have any questions concerning this invoice, contact Bia Hermes. " "THANK YOU FOR YOUR BUSINESS! INVOICE INVOICE # 0001 DATE 01/01/2022 FOR Alpha Project P.O. # 1000"}, {"query": "What was the amount of the trade surplus?", "answer": "62.4 billion yen ($416.6 million)", "context": "Japan’s September trade balance swings into surplus, surprising expectations" "Japan recorded a trade surplus of 62.4 billion yen ($416.6 million) for September, " "beating expectations from economists polled by Reuters for a trade deficit of 42.5 " "billion yen. Data from Japan’s customs agency revealed that exports in September " "increased 4.3% year on year, while imports slid 16.3% compared to the same period " "last year. According to FactSet, exports to Asia fell for the ninth straight month, " "which reflected ongoing China weakness. Exports were supported by shipments to " "Western markets, FactSet added. 
— Lim Hui Jie"}, {"query": "What was Microsoft's revenue in the 3rd quarter?", "answer": "$52.9 billion", "context": "Microsoft Cloud Strength Drives Third Quarter Results \nREDMOND, Wash. — April 25, 2023 — " "Microsoft Corp. today announced the following results for the quarter ended March 31, 2023," " as compared to the corresponding period of last fiscal year:\n· Revenue was $52.9 billion" " and increased 7% (up 10% in constant currency)\n· Operating income was $22.4 billion " "and increased 10% (up 15% in constant currency)\n· Net income was $18.3 billion and " "increased 9% (up 14% in constant currency)\n· Diluted earnings per share was $2.45 " "and increased 10% (up 14% in constant currency).\n"}, {"query": "When did the LISP machine market collapse?", "answer": "1987.", "context": "The attendees became the leaders of AI research in the 1960s." " They and their students produced programs that the press described as 'astonishing': " "computers were learning checkers strategies, solving word problems in algebra, " "proving logical theorems and speaking English. By the middle of the 1960s, research in " "the U.S. was heavily funded by the Department of Defense and laboratories had been " "established around the world. Herbert Simon predicted, 'machines will be capable, " "within twenty years, of doing any work a man can do'. Marvin Minsky agreed, writing, " "'within a generation ... the problem of creating 'artificial intelligence' will " "substantially be solved'. They had, however, underestimated the difficulty of the problem. " "Both the U.S. and British governments cut off exploratory research in response " "to the criticism of Sir James Lighthill and ongoing pressure from the US Congress " "to fund more productive projects. Minsky's and Papert's book Perceptrons was understood " "as proving that artificial neural networks approach would never be useful for solving " "real-world tasks, thus discrediting the approach altogether. The 'AI winter', a period " "when obtaining funding for AI projects was difficult, followed. In the early 1980s, " "AI research was revived by the commercial success of expert systems, a form of AI " "program that simulated the knowledge and analytical skills of human experts. By 1985, " "the market for AI had reached over a billion dollars. At the same time, Japan's fifth " "generation computer project inspired the U.S. and British governments to restore funding " "for academic research. However, beginning with the collapse of the Lisp Machine market " "in 1987, AI once again fell into disrepute, and a second, longer-lasting winter began."}, {"query": "When will employment start?", "answer": "April 16, 2012.", "context": "THIS EXECUTIVE EMPLOYMENT AGREEMENT (this “Agreement”) is entered " "into this 2nd day of April, 2012, by and between Aphrodite Apollo " "(“Executive”) and TestCo Software, Inc. (the “Company” or “Employer”), " "and shall become effective upon Executive’s commencement of employment " "(the “Effective Date”) which is expected to commence on April 16, 2012. " "The Company and Executive agree that unless Executive has commenced " "employment with the Company as of April 16, 2012 (or such later date as " "agreed by each of the Company and Executive) this Agreement shall be " "null and void and of no further effect."}, {"query": "What is the current rate on 10-year treasuries?", "answer": "4.58%", "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " "and a major increase in Treasury yields. 
The Dow Jones Industrial Average gained 195.12 points, " "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy " "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " "jobs. However, wages rose less than expected last month. Stocks posted a stunning " "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " "At its session low, the Dow had fallen as much as 198 points; it surged by more than " "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " "their lowest points in the day. Traders were unclear of the reason for the intraday " "reversal. Some noted it could be the softer wage number in the jobs report that made " "investors rethink their earlier bearish stance. Others noted the pullback in yields from " "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " "near its highest level in 14 years. The benchmark rate later eased from those levels, but " "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s " "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries " "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially " "some oversold conditions.'"}, {"query": "What is the governing law?", "answer": "State of Massachusetts", "context": "19. Governing Law and Procedures. This Agreement shall be governed by and interpreted " "under the laws of the State of Massachusetts, except with respect to Section 18(a) of this Agreement," " which shall be governed by the laws of the State of Delaware, without giving effect to any " "conflict of laws provisions. Employer and Executive each irrevocably and unconditionally " "(a) agrees that any action commenced by Employer for preliminary and permanent injunctive relief " "or other equitable relief related to this Agreement or any action commenced by Executive pursuant " "to any provision hereof, may be brought in the United States District Court for the federal " "district in which Executive’s principal place of employment is located, or if such court does " "not have jurisdiction or will not accept jurisdiction, in any court of general jurisdiction " "in the state and county in which Executive’s principal place of employment is located, " "(b) consents to the non-exclusive jurisdiction of any such court in any such suit, action o" "r proceeding, and (c) waives any objection which Employer or Executive may have to the " "laying of venue of any such suit, action or proceeding in any such court. Employer and " "Executive each also irrevocably and unconditionally consents to the service of any process, " "pleadings, notices or other papers in a manner permitted by the notice provisions of Section 8."}, {"query": "What is the amount of the base salary?", "answer": "$200,000.", "context": "2.2. Base Salary. 
For all the services rendered by Executive hereunder, during the " "Employment Period, Employer shall pay Executive a base salary at the annual rate of " "$200,000, payable semimonthly in accordance with Employer’s normal payroll practices. " "Executive’s base salary shall be reviewed annually by the Board (or the compensation committee " "of the Board), pursuant to Employer’s normal compensation and performance review policies " "for senior level executives, and may be increased but not decreased. The amount of any " "increase for each year shall be determined accordingly. For purposes of this Agreement, " "the term “Base Salary” shall mean the amount of Executive’s base salary established " "from time to time pursuant to this Section 2.2. "}, {"query": "Is the expected gross margin greater than 70%?", "answer": "Yes, between 71.5% and 72.%", "context": "Outlook NVIDIA’s outlook for the third quarter of fiscal 2024 is as follows:" "Revenue is expected to be $16.00 billion, plus or minus 2%. GAAP and non-GAAP " "gross margins are expected to be 71.5% and 72.5%, respectively, plus or minus " "50 basis points. GAAP and non-GAAP operating expenses are expected to be " "approximately $2.95 billion and $2.00 billion, respectively. GAAP and non-GAAP " "other income and expense are expected to be an income of approximately $100 " "million, excluding gains and losses from non-affiliated investments. GAAP and " "non-GAAP tax rates are expected to be 14.5%, plus or minus 1%, excluding any discrete items." "Highlights NVIDIA achieved progress since its previous earnings announcement " "in these areas: Data Center Second-quarter revenue was a record $10.32 billion, " "up 141% from the previous quarter and up 171% from a year ago. Announced that the " "NVIDIA® GH200 Grace™ Hopper™ Superchip for complex AI and HPC workloads is shipping " "this quarter, with a second-generation version with HBM3e memory expected to ship " "in Q2 of calendar 2024. "}, {"query": "What is Bank of America's rating on Target?", "answer": "Buy", "context": "Here are some of the tickers on my radar for Thursday, Oct. 12, taken directly from " "my reporter’s notebook: It’s the one-year anniversary of the S&P 500′s bear market bottom " "of 3,577. Since then, as of Wednesday’s close of 4,376, the broad market index " "soared more than 22%. Hotter than expected September consumer price index, consumer " "inflation. The Social Security Administration issues announced a 3.2% cost-of-living " "adjustment for 2024. Chipotle Mexican Grill (CMG) plans price increases. Pricing power. " "Cites consumer price index showing sticky retail inflation for the fourth time " "in two years. Bank of America upgrades Target (TGT) to buy from neutral. Cites " "risk/reward from depressed levels. Traffic could improve. Gross margin upside. " "Merchandising better. Freight and transportation better. Target to report quarter " "next month. In retail, the CNBC Investing Club portfolio owns TJX Companies (TJX), " "the off-price juggernaut behind T.J. Maxx, Marshalls and HomeGoods. Goldman Sachs " "tactical buy trades on Club names Wells Fargo (WFC), which reports quarter Friday, " "Humana (HUM) and Nvidia (NVDA). BofA initiates Snowflake (SNOW) with a buy rating." "If you like this story, sign up for Jim Cramer’s Top 10 Morning Thoughts on the " "Market email newsletter for free. Barclays cuts price targets on consumer products: " "UTZ Brands (UTZ) to $16 per share from $17. Kraft Heinz (KHC) to $36 per share from " "$38. Cyclical drag. J.M. 
Smucker (SJM) to $129 from $160. Secular headwinds. " "Coca-Cola (KO) to $59 from $70. Barclays cut PTs on housing-related stocks: Toll Brothers" "(TOL) to $74 per share from $82. Keeps underweight. Lowers Trex (TREX) and Azek" "(AZEK), too. Goldman Sachs (GS) announces sale of fintech platform and warns on " "third quarter of 19-cent per share drag on earnings. The buyer: investors led by " "private equity firm Sixth Street. Exiting a mistake. Rise in consumer engagement for " "Spotify (SPOT), says Morgan Stanley. The analysts hike price target to $190 per share " "from $185. Keeps overweight (buy) rating. JPMorgan loves elf Beauty (ELF). Keeps " "overweight (buy) rating but lowers price target to $139 per share from $150. " "Sees “still challenging” environment into third-quarter print. The Club owns shares " "in high-end beauty company Estee Lauder (EL). Barclays upgrades First Solar (FSLR) " "to overweight from equal weight (buy from hold) but lowers price target to $224 per " "share from $230. Risk reward upgrade. Best visibility of utility scale names."}, {"query": "Who is NVIDIA's partner for the driver assistance system?", "answer": "MediaTek", "context": "Automotive Second-quarter revenue was $253 million, down 15% from the previous " "quarter and up 15% from a year ago. Announced that NVIDIA DRIVE Orin™ is powering " "the new XPENG G6 Coupe SUV’s intelligent advanced driver assistance system. " "Partnered with MediaTek, which will develop mainstream automotive systems on " "chips for global OEMs, which integrate new NVIDIA GPU chiplet IP for AI and graphics."}, {"query": "What was the rate of decline in 3rd quarter sales?", "answer": "20% year-on-year.", "context": "Nokia said it would cut up to 14,000 jobs as part of a cost cutting plan following " "third quarter earnings that plunged. The Finnish telecommunications giant said that " "it will reduce its cost base and increase operation efficiency to “address the " "challenging market environment. The substantial layoffs come after Nokia reported " "third-quarter net sales declined 20% year-on-year to 4.98 billion euros. Profit over " "the period plunged by 69% year-on-year to 133 million euros."}, {"query": "What was professional visualization revenue in the quarter?", "answer": "$379 million", "context": "Gaming Second-quarter revenue was $2.49 billion, up 11% from the previous quarter and up " "22% from a year ago. Began shipping the GeForce RTX™ 4060 family of GPUs, " "bringing to gamers NVIDIA Ada Lovelace architecture and DLSS, starting at $299." "Announced NVIDIA Avatar Cloud Engine, or ACE, for Games, a custom AI model " "foundry service using AI-powered natural language interactions to transform games " "by bringing intelligence to non-playable characters. Added 35 DLSS games, including " "Diablo IV, Ratchet & Clank: Rift Apart, Baldur’s Gate 3 and F1 23, as well as Portal: " "Prelude RTX, a path-traced game made by the community using NVIDIA’s RTX Remix creator tool." "Professional Visualization Second-quarter revenue was $379 million, up 28% from the " "previous quarter and down 24% from a year ago. Announced three new desktop " "workstation RTX GPUs based on the Ada Lovelace architecture — NVIDIA RTX 5000, RTX 4500 " "and RTX 4000 — to deliver the latest AI, graphics and real-time rendering, which are " "shipping this quarter. 
Announced a major release of the NVIDIA Omniverse platform, " "with new foundation applications and services for developers and industrial " "enterprises to optimize and enhance their 3D pipelines with OpenUSD and " "generative AI. Joined with Pixar, Adobe, Apple and Autodesk to form the " "Alliance for OpenUSD to promote the standardization, development, evolution and " "growth of Universal Scene Description technology."}, {"query": "What is the executive's title?", "answer": "Senior Vice President, Event Planning ('SVP') of the Workforce Optimization Division.", "context": "2.1. Duties and Responsibilities and Extent of Service. During the Employment Period, " "Executive shall serve as Senior Vice President, Event Planning (“SVP”) of the Employer’s " "Workforce Optimization Division. In such role, Executive will report to the Board of " "Directors of Employer (the “Board”) and shall devote substantially all of his business time " "and attention and his best efforts and ability to the operations of Employer and its subsidiaries. " "Executive shall be responsible for running Employer’s day-to-day operations and shall perform " "faithfully, diligently and competently the duties and responsibilities of a SVP and such other " "duties and responsibilities as directed by the Board and are consistent with such position. " "The foregoing shall not be construed as preventing Executive from (a) making passive " "investments in other businesses or enterprises consistent with Employer’s code of conduct, " "or (b) engaging in any other business activity consistent with Employer’s code of conduct; " "provided that Executive seeks and obtains the prior approval of the Board before engaging " "in any other business activity. In addition, it shall not be a violation of this Agreement " "for Executive to participate in civic or charitable activities, deliver lectures, fulfill " "speaking engagements, teach at educational institutions, and/or manage personal investments " "(subject to the immediately preceding sentence); provided that such activities do not " "interfere in any substantial respect with the performance of Executive’s responsibilities " "as an employee in accordance with this Agreement. Executive may also serve on one or more " "corporate boards of another company (and committees thereof) upon giving advance notice " "to the Board prior to commencing service on any other corporate board."}, {"query": "According to the CFO, what led to the increase in cloud revenue?", "answer": "Focused execution by our sales teams and partners", "context": "'The world's most advanced AI models " "are coming together with the world's most universal user interface - natural language - " "to create a new era of computing,' said Satya Nadella, chairman and chief " "executive officer of Microsoft. 'Across the Microsoft Cloud, we are the platform " "of choice to help customers get the most value out of their digital spend and innovate " "for this next generation of AI.' 'Focused execution by our sales teams and partners " "in this dynamic environment resulted in Microsoft Cloud revenue of $28.5 billion, " "up 22% (up 25% in constant currency) year-over-year,' said Amy Hood, executive " "vice president and chief financial officer of Microsoft.\n"}, {"query": "Which company is located in Nevada?", "answer": "North Industries", "context": "To send notices to Blue Moon Tech, mail to their headquarters at: " "555 California Street, San Francisco, California 94123. 
To send notices to North Industries, mail to" "their principal U.S. offices at: 19832 32nd Avenue, Las Vegas, Nevada 23593.\nTo send notices " "to Red River Industries, send to: One Red River Road, Stamford, Connecticut 08234."}, {"query": "When can termination after a material breach occur?", "answer": "If the breach is not cured within 15 days of notice of the breach.", "context": "This Agreement shall remain in effect until terminated. Either party may terminate this " "agreement, any Statement of Work or Services Description for convenience by giving the other " "party 30 days written notice. Either party may terminate this Agreement or any work order or " "services description if the other party is in material breach or default of any obligation " "that is not cured within 15 days’ notice of such breach. The TestCo agrees to pay all fees " "for services performed and expenses incurred prior to the termination of this Agreement. " "Termination of this Agreement will terminate all outstanding Statement of Work or Services " "Description entered into under this agreement."}, {"query": "What is a headline summary in 10 words or less?", "answer": "Joe Biden is the 46th President of the United States.", "context": "Joe Biden's tenure as the 46th president of the United States began with " "his inauguration on January 20, 2021. Biden, a Democrat from Delaware who " "previously served as vice president under Barack Obama, " "took office following his victory in the 2020 presidential election over " "Republican incumbent president Donald Trump. Upon his inauguration, he " "became the oldest president in American history."}, {"query": "Who are the two people that won elections in Georgia?", "answer": "Jon Ossoff and Raphael Warnock", "context": "Though Biden was generally acknowledged as the winner, " "General Services Administration head Emily W. Murphy " "initially refused to begin the transition to the president-elect, " "thereby denying funds and office space to his team. " "On November 23, after Michigan certified its results, Murphy " "issued the letter of ascertainment, granting the Biden transition " "team access to federal funds and resources for an orderly transition. " "Two days after becoming the projected winner of the 2020 election, " "Biden announced the formation of a task force to advise him on the " "COVID-19 pandemic during the transition, co-chaired by former " "Surgeon General Vivek Murthy, former FDA commissioner David A. Kessler, " "and Yale University's Marcella Nunez-Smith. On January 5, 2021, " "the Democratic Party won control of the United States Senate, " "effective January 20, as a result of electoral victories in " "Georgia by Jon Ossoff in a runoff election for a six-year term " "and Raphael Warnock in a special runoff election for a two-year term. " "President-elect Biden had supported and campaigned for both " "candidates prior to the runoff elections on January 5.On January 6, " "a mob of thousands of Trump supporters violently stormed the Capitol " "in the hope of overturning Biden's election, forcing Congress to " "evacuate during the counting of the Electoral College votes. 
More " "than 26,000 National Guard members were deployed to the capital " "for the inauguration, with thousands remaining into the spring."}, {"query": "What is the list of the top financial highlights for the quarter?", "answer": "•Revenue: $52.9 million, up 10% in constant currency;\n" "•Operating income: $22.4 billion, up 15% in constant currency;\n" "•Net income: $18.3 billion, up 14% in constant currency;\n" "•Diluted earnings per share: $2.45 billion, up 14% in constant currency.", "context": "Microsoft Cloud Strength Drives Third Quarter Results \nREDMOND, Wash. — April 25, 2023 — " "Microsoft Corp. today announced the following results for the quarter ended March 31, 2023," " as compared to the corresponding period of last fiscal year:\n· Revenue was $52.9 billion" " and increased 7% (up 10% in constant currency)\n· Operating income was $22.4 billion " "and increased 10% (up 15% in constant currency)\n· Net income was $18.3 billion and " "increased 9% (up 14% in constant currency)\n· Diluted earnings per share was $2.45 " "and increased 10% (up 14% in constant currency).\n"}, {"query": "What is a list of the key points?", "answer": "•Stocks rallied on Friday with stronger-than-expected U.S jobs data and increase in " "Treasury yields;\n•Dow Jones gained 195.12 points;\n•S&P 500 added 1.59%;\n•Nasdaq Composite rose " "1.35%;\n•U.S. economy added 438,000 jobs in August, better than the 273,000 expected;\n" "•10-year Treasury rate trading near the highest level in 14 years at 4.58%.", "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, " "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy " "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " "jobs. However, wages rose less than expected last month. Stocks posted a stunning " "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " "At its session low, the Dow had fallen as much as 198 points; it surged by more than " "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " "their lowest points in the day. Traders were unclear of the reason for the intraday " "reversal. Some noted it could be the softer wage number in the jobs report that made " "investors rethink their earlier bearish stance. Others noted the pullback in yields from " "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " "near its highest level in 14 years. The benchmark rate later eased from those levels, but " "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s " "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries " "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially " "some oversold conditions.'"} ] return test_list def bling_meets_llmware_hello_world (model_name): """ Simple inference loop that loads a model and runs through a series of test questions. 
""" t0 = time.time() test_list = hello_world_questions() print(f"\n > Loading Model: {model_name}...") prompter = Prompt().load_model(model_name) t1 = time.time() print(f"\n > Model {model_name} load time: {t1-t0} seconds") for i, entries in enumerate(test_list): print(f"\n{i+1}. Query: {entries['query']}") # run the prompt output = prompter.prompt_main(entries["query"],context=entries["context"] , prompt_name="default_with_context",temperature=0.30) llm_response = output["llm_response"].strip("\n") print(f"LLM Response: {llm_response}") print(f"Gold Answer: {entries['answer']}") print(f"LLM Usage: {output['usage']}") t2 = time.time() print(f"\nTotal processing time: {t2-t1} seconds") return 0 if __name__ == "__main__": # list of 'rag-instruct' laptop-ready bling models on HuggingFace model_list = ["llmware/bling-1b-0.1", "llmware/bling-tiny-llama-v0", "llmware/bling-1.4b-0.1", "llmware/bling-falcon-1b-0.1", "llmware/bling-cerebras-1.3b-0.1", "llmware/bling-sheared-llama-1.3b-0.1", "llmware/bling-sheared-llama-2.7b-0.1", "llmware/bling-red-pajamas-3b-0.1", "llmware/bling-stable-lm-3b-4e1t-v0", "llmware/bling-phi-3", # use GGUF models too "bling-phi-3-gguf", # quantized bling-phi-3 "bling-answer-tool", # quantized bling-tiny-llama "bling-stablelm-3b-tool" # quantized bling-stablelm-3b ] # try the newest bling model - 'tiny-llama' bling_meets_llmware_hello_world(model_list[1]) ``` For more examples, see the [models examples]((https://www.github.com/llmware-ai/llmware/tree/main/examples/Models/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  {% endfor %}
---
--- --- --- layout: default title: Notebooks parent: Examples nav_order: 11 description: overview of the major modules and classes of LLMWare permalink: /examples/notebooks ---

# Notebooks - Introduction by Examples

We introduce ``llmware`` through self-contained examples.

# Understanding Google Colab and Jupyter Notebooks

Welcome to our project documentation! A common point of confusion among developers new to data science and machine learning workflows is the relationship and differences between Google Colab and Jupyter Notebooks. This README aims to clarify these points to ensure everyone is on the same page.

## What are Jupyter Notebooks?

Jupyter Notebooks is an open-source web application that lets you create and share documents that have live code, equations, visualizations, and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.

## What is Google Colab?

Google Colab (or Colaboratory) is a free Jupyter notebook environment that requires no setup and runs in the cloud. It offers a similar interface to Jupyter Notebooks and lets users write and execute Python in a web browser. Google Colab also provides free access to computing resources, including GPUs and TPUs, making it highly popular for machine learning and data analysis projects.

## Key Similarities

- **Interface:** Both platforms use the Jupyter Notebook interface, which supports mixing executable code, equations, visualizations, and narrative text in a single document.
- **Language Support:** Primarily, both are used for executing Python code. However, Jupyter Notebooks support other languages such as R and Julia.
- **Use Cases:** They are widely used for data analysis, machine learning, and education, allowing for easy sharing of results and methodologies.

## Increase Google Colab Computational Power with T4 GPU

Our models are designed to run on at least 16GB of RAM. By default, Google Colab provides ~13GB of RAM, which significantly slows computational speed. To ensure the best performance when using our models, we highly recommend enabling the T4 GPU in Colab. This will provide the notebook with additional resources, including 16GB of RAM, allowing our models to run smoothly and efficiently.

Steps to enable the T4 GPU in Colab:

1. In your Colab notebook, click on the "Runtime" tab
2. Select "Change runtime type"
3. Under "Hardware Accelerator", select T4 GPU

NOTE: There is a weekly usage limit on using the T4 for free. A short sketch for checking the runtime is included at the end of this page.

## Key Differences

- **Execution Environment:** Jupyter Notebooks can be run locally on your machine or on a server, but Google Colab is hosted in the cloud.
- **Access to Resources:** Google Colab provides free access to hardware accelerators (GPUs and TPUs), which is not inherently available in Jupyter Notebooks unless specifically set up by the user on their own servers.
- **Collaboration:** Google Colab offers easier collaboration features, similar to Google Docs, letting multiple users work on the same notebook simultaneously.

## Conclusion

While Google Colab and Jupyter Notebooks might seem different, they are built on the same idea and offer similar functionalities, with a few distinctions mainly in execution environment and access to computing resources. Understanding these platforms' capabilities can significantly enhance your data science and machine learning projects.

We hope this guide has helped clarify the similarities and differences between Google Colab and Jupyter Notebooks. Happy coding!
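## Quick runtime check (optional)

Before running the ``llmware`` examples in a Colab notebook, it can help to confirm that the runtime actually has the GPU and memory you expect. The short sketch below is illustrative; it assumes a standard Colab runtime, where ``torch`` and ``psutil`` are pre-installed (neither is part of ``llmware`` itself), and that ``llmware`` has been installed with ``pip install llmware``.

```python
# quick sanity check for a Colab runtime (illustrative sketch - not part of the llmware examples)
import psutil
import torch

# total RAM available to the runtime, in GB
ram_gb = psutil.virtual_memory().total / 1e9
print(f"runtime RAM: {ram_gb:.1f} GB")

# confirm that a GPU accelerator is attached
if torch.cuda.is_available():
    print("GPU enabled: ", torch.cuda.get_device_name(0))
else:
    print("no GPU detected - enable the T4 GPU under Runtime > Change runtime type")
```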
--- --- layout: default title: Parsing parent: Examples nav_order: 4 description: overview of the major modules and classes of LLMWare permalink: /examples/parsing ---

# Parsing - Introduction by Examples

We introduce ``llmware`` through self-contained examples.

🚀 Parsing Examples 🚀
===============

**Parsing is the Humble Hero of Good RAG Pipelines**

LLMWare supports parsing of a wide range of unstructured content types, and views parsing, text chunking and indexing as the first step in the pipeline. As with any pipeline, care and attention to getting "great input" is usually the key to "great output."

In this repository, we show several key features of parsing with llmware:

**Parsing PDFs like a Pro**

- Configuring text chunking and extraction parameters - [**PDF Configuration**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/pdf_parser_new_configs.py)
- PDF Table extraction - [**PDF Table**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/pdf_table_extraction.py)
- Fallback to OCR - [**PDF-by-OCR**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_pdf_by_ocr.py)

**Parsing Office Documents (Powerpoints, Word, Excel)**

- Configuring text chunking and extraction parameters - [**Office Configuration**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/office_parser_new_configs.py)
- Handling ZIPs and mixed file types - [**Microsoft IR Documents**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parsing_microsoft_ir_docs.py)
- Running OCR on Images Extracted - [**OCR Embedded Doc Images**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/ocr_embedded_doc_images.py)

**Parsing without a Database**

- Parse in Memory - [**Parse in Memory**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_in_memory.py)
- Parse directly into a Prompt - [**Parse in Prompt**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_into_prompt.py)
- Parse to JSON file - [**Parse to JSON**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_to_json.py)

**Other Content Types**

- Custom CSV - [**Custom CSV files**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_csv_custom.py)
- Custom JSON - [**Custom JSON files**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_jsonl_custom.py)
- Images - [**OCR on Images**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_images.py)
- Web/HTML - [**Website Extraction**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_web_sources_in_memory.py)
- Voice (WAV) - in Use_Cases - [**Parsing Great Speeches**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/parsing_great_speeches.py)

For more examples, see the [parsing examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well.

### **Let's get started! 🚀**

--- --- layout: default title: Prompts parent: Examples nav_order: 6 description: overview of the major modules and classes of LLMWare permalink: /examples/prompts ---

# Prompts - Introduction by Examples

We introduce ``llmware`` through self-contained examples.
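As a warm-up before the fuller scenarios below, here is a minimal sketch of the basic ``Prompt`` pattern used throughout these examples: load a model, pass a question together with a source passage as context, and read the response. The model name and context passage are illustrative; any model from the ``ModelCatalog`` can be substituted.

```python
from llmware.prompts import Prompt

# load a small, quantized, CPU-friendly model from the llmware catalog
prompter = Prompt().load_model("bling-answer-tool")

# illustrative source passage - in the examples below, context is assembled from parsed documents
context_passage = ("Services Vendor Inc. Total Amount $22,500.00. "
                   "Payment is due within 30 days.")

# ask a question against the provided context
response = prompter.prompt_main("What is the total amount of the invoice?",
                                context=context_passage,
                                prompt_name="default_with_context",
                                temperature=0.30)

print("llm response: ", response["llm_response"])
```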
# Basic RAG Scenario - Invoice Processing ```python """ This example shows an end-to-end scenario for invoice processing that can be run locally and without a database. The example shows how to combine the use of parsing combined with prompts_with_sources to rapidly iterate through a batch of invoices and ask a set of questions, and then save the full output to both (1) .jsonl for integration into an upstream application/database and (2) to a CSV for human review in excel. note: the sample code pulls from a public repo to load the sample invoice documents the first time - please feel free to substitute with your own invoice documents (PDF/DOCX/PPTX/XLSX/CSV/TXT) if you prefer. this example does not require a database or embedding this example can be run locally on a laptop by setting 'run_on_cpu=True' if 'run_on_cpu==False", then please see the example 'launch_llmware_inference_server.py' to configure and set up a 'pop-up' GPU inference server in just a few minutes """ import os import re from llmware.prompts import Prompt, HumanInTheLoop from llmware.configs import LLMWareConfig from llmware.setup import Setup from llmware.models import ModelCatalog def invoice_processing(run_on_cpu=True): # Step 1 - Pull down the sample files from S3 through the .load_sample_files() command # --note: if you need to refresh the sample files, set 'over_write=True' print("update: Downloading Sample Files") sample_files_path = Setup().load_sample_files(over_write=False) invoices_path = os.path.join(sample_files_path, "Invoices") # Step 2 - simple sample query list - each question will be asked to each invoice query_list = ["What is the total amount of the invoice?", "What is the invoice number?", "What are the names of the two parties?"] # Step 3 - Load Model if run_on_cpu: # load local bling model that can run on cpu/laptop # note: bling-1b-0.1 is the *fastest* & *smallest*, but will make more errors than larger BLING models # model_name = "llmware/bling-1b-0.1" # try the new bling-phi-3 quantized with gguf - most accurate model_name = 'bling-phi-3-gguf' else: # use GPU-based inference server to process # *** see the launch_llmware_inference_server.py example script to setup *** server_uri_string = "http://11.123.456.789:8088" # insert your server_uri_string server_secret_key = "demo-test" ModelCatalog().setup_custom_llmware_inference_server(server_uri_string, secret_key=server_secret_key) model_name = "llmware-inference-server" # attach inference server to prompt object prompter = Prompt().load_model(model_name) # Step 4 - main loop thru folder of invoices for i, invoice in enumerate(os.listdir(invoices_path)): # just in case (legacy on mac os file system - not needed on linux or windows) if invoice != ".DS_Store": print("\nAnalyzing invoice: ", str(i + 1), invoice) for question in query_list: # Step 4A - parses the invoices in memory and attaches as a source to the Prompt source = prompter.add_source_document(invoices_path,invoice) # Step 4B - executes the prompt on the LLM (with the loaded source) output = prompter.prompt_with_source(question,prompt_name="default_with_context") for i, response in enumerate(output): print("LLM Response - ", question, " - ", re.sub("[\n]"," ", response["llm_response"])) prompter.clear_source_materials() # Save jsonl report with full transaction history to /prompt_history folder print("\nupdate: prompt state saved at: ", os.path.join(LLMWareConfig.get_prompt_path(),prompter.prompt_id)) prompter.save_state() # Generate CSV report for easy Human review in Excel csv_output = 
HumanInTheLoop(prompter).export_current_interaction_to_csv() print("\nupdate: csv output for human review - ", csv_output) return 0 if __name__ == "__main__": invoice_processing(run_on_cpu=True) ``` # Document Summarizer ```python """ This Example shows a packaged 'document_summarizer' prompt using the slim-summary-tool. It shows a variety of techniques to summarize documents generally larger than a LLM context window, and how to assemble multiple source batches from the document, as well as using a 'query' and 'topic' to focus on specific segments of the document. """ import os from llmware.prompts import Prompt from llmware.setup import Setup def test_summarize_document(example="jd salinger"): # pull a sample document (or substitute a file_path and file_name of your own) sample_files_path = Setup().load_sample_files(over_write=False) topic = None query = None fp = None fn = None if example not in ["jd salinger", "employment terms", "just the comp", "un resolutions"]: print ("not found example") return [] if example == "jd salinger": fp = os.path.join(sample_files_path, "SmallLibrary") fn = "Jd-Salinger-Biography.docx" topic = "jd salinger" query = None if example == "employment terms": fp = os.path.join(sample_files_path, "Agreements") fn = "Athena EXECUTIVE EMPLOYMENT AGREEMENT.pdf" topic = "executive compensation terms" query = None if example == "just the comp": fp = os.path.join(sample_files_path, "Agreements") fn = "Athena EXECUTIVE EMPLOYMENT AGREEMENT.pdf" topic = "executive compensation terms" query = "base salary" if example == "un resolutions": fp = os.path.join(sample_files_path, "SmallLibrary") fn = "N2126108.pdf" # fn = "N2137825.pdf" topic = "key points" query = None # optional parameters: 'query' - will select among blocks with the query term # 'topic' - will pass a topic/issue as the parameter to the model to 'focus' the summary # 'max_batch_cap' - caps the number of batches sent to the model # 'text_only' - returns just the summary text aggregated kp = Prompt().summarize_document_fc(fp, fn, topic=topic, query=query, text_only=True, max_batch_cap=15) print(f"\nDocument summary completed - {len(kp)} Points") for i, points in enumerate(kp): print(i, points) return 0 if __name__ == "__main__": print(f"\nExample: Summarize Documents\n") # 4 examples - ["jd salinger", "employment terms", "just the comp", "un resolutions"] # -- "jd salinger" - summarizes key points about jd salinger from short biography document # -- "employment terms" - summarizes the executive compensation terms across 15 page document # -- "just the comp" - queries to find subset of document and then summarizes the key terms # -- "un resolutions" - summarizes the un resolutions document summary_direct = test_summarize_document(example="employment terms") ``` For more examples, see the [prompt examples]((https://www.github.com/llmware-ai/llmware/tree/main/examples/Prompts/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). 
You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  {% endfor %}
---
--- --- --- layout: default title: Retrieval parent: Examples nav_order: 7 description: overview of the major modules and classes of LLMWare permalink: /examples/retrieval ---

# Retrieval - Introduction by Examples

We introduce ``llmware`` through self-contained examples.

# SEMANTIC Retrieval Example

```python
""" This 'getting started' example demonstrates how to use basic semantic retrieval with the Query class
    1. Create a sample library
    2. Run a basic semantic query
    3. View the results
"""

import os
from llmware.library import Library
from llmware.retrieval import Query
from llmware.setup import Setup
from llmware.configs import LLMWareConfig


def create_fin_docs_sample_library(library_name):

    print(f"update: creating library - {library_name}")

    library = Library().create_new_library(library_name)
    sample_files_path = Setup().load_sample_files(over_write=False)
    ingestion_folder_path = os.path.join(sample_files_path, "FinDocs")
    parsing_output = library.add_files(ingestion_folder_path)

    print(f"update: building embeddings - may take a few minutes the first time")

    # note: if you have installed Milvus or another vector DB, please feel free to substitute
    # note: if you have any memory constraints on laptop:
    #   (1) reduce embedding batch_size or ...
    #   (2) substitute "mini-lm-sbert" as embedding model
    library.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="chromadb", batch_size=200)

    return library


def basic_semantic_retrieval_example(library):

    # Create a Query instance
    q = Query(library)

    # Set the keys that should be returned - optional - full set of keys will be returned by default
    q.query_result_return_keys = ["distance", "file_source", "page_num", "text"]

    # perform a simple query
    my_query = "ESG initiatives"
    query_results1 = q.semantic_query(my_query, result_count=20)

    # Iterate through query_results, which is a list of result dicts
    print(f"\nQuery 1 - {my_query}")
    for i, result in enumerate(query_results1):
        print("results - ", i, result)

    # perform another query
    my_query2 = "stock performance"
    query_results2 = q.semantic_query(my_query2, result_count=10)

    print(f"\nQuery 2 - {my_query2}")
    for i, result in enumerate(query_results2):
        print("results - ", i, result)

    # perform another query
    my_query3 = "cloud computing"

    # note: use of embedding_distance_threshold will cap results with distance < 1.0
    query_results3 = q.semantic_query(my_query3, result_count=50, embedding_distance_threshold=1.0)

    print(f"\nQuery 3 - {my_query3}")
    for i, result in enumerate(query_results3):
        print("result - ", i, result)

    return [query_results1, query_results2, query_results3]


if __name__ == "__main__":

    print(f"Example - Running a Basic Semantic Query")

    LLMWareConfig().set_active_db("sqlite")

    # step 1- will create library + embeddings with Financial Docs
    lib = create_fin_docs_sample_library("lib_semantic_query_1")

    # step 2- run query against the library and embeddings
    my_results = basic_semantic_retrieval_example(lib)
```

For more examples, see the [retrieval examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Retrieval/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well.

# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
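As a follow-on to the semantic example above, the same ``Query`` object also supports plain text queries and document-filtered queries. The sketch below is a hedged illustration only: it assumes the ``lib_semantic_query_1`` library created above already exists, and the document filter file name is a placeholder.

```python
""" Follow-on sketch - assumes the 'lib_semantic_query_1' library from the example above
    has already been created and embedded. """

from llmware.library import Library
from llmware.retrieval import Query
from llmware.configs import LLMWareConfig

LLMWareConfig().set_active_db("sqlite")

# load the previously created library
lib = Library().load_library("lib_semantic_query_1")
q = Query(lib)

# plain text query - runs against the text collection, no embeddings required
text_results = q.text_query("ESG initiatives", result_count=20, exact_mode=False)
for i, result in enumerate(text_results):
    print("text query results - ", i, result)

# restrict a query to a single document in the library (file name is illustrative)
filtered_results = q.text_query_with_document_filter("stock performance",
                                                     {"file_name": "selected file name"},
                                                     exact_mode=True)
for i, result in enumerate(filtered_results):
    print("filtered results - ", i, result)
```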
---
--- --- --- layout: default title: Structured Tables parent: Examples nav_order: 9 description: overview of the major modules and classes of LLMWare permalink: /examples/structured_tables --- # Structured Tables - Introduction by Examples We introduce ``llmware`` through self-contained examples. ```python """ This example shows the basic recipe for creating a CustomTable with LLMWare and a few of the basic methods to quickly get started. In this example, we will build a very simple 'hello world' Files table, which we will build upon in a future example by aggregating a more interesting and useful set of attributes from a LLMWare Library collection. CustomTable is designed to work with the text collection databases supported by LLMWare: SQL DBs --- Postgres and SQLIte NoSQL DB --- Mongo DB Even though Mongo does not require a schema for inserting and retrieving information, the CustomTable method will expect a defined schema to be provided (good best practice, in any case). """ from llmware.resources import CustomTable def hello_world_custom_table(): # simple schema for a table to track Files/Documents # note: the schema is a python dictionary, with named keys, and the value corresponding to the data type # for sqlite and postgres, any standard sql data type should generally work files_schema = {"custom_doc_num": "integer", "file_name": "text", "comments": "text"} # create a CustomTable object db_name = "sqlite" table_name = "files_table_1000" ct = CustomTable(db=db_name,table_name=table_name, schema=files_schema) # insert a few sample rows - each row is a dictionary with keys from the schema, and the *actual* values r1 = {"custom_doc_num": 1, "file_name": "technical_manual.pdf", "comments": "very useful overview"} ct.write_new_record(r1) r2 = {"custom_doc_num": 2, "file_name": "work_presentation.pptx", "comments": "need to save for future reference"} ct.write_new_record(r2) r3 = {"custom_doc_num": 3, "file_name": "dataset.json", "comments": "will use in next project"} ct.write_new_record(r3) # to see the entries - pull all items from the table all_results = ct.get_all() print("\nTEST #1 - Retrieving All Elements") for i, res in enumerate(all_results): print("results: ", i, res) # look at the database schema schema = ct.get_schema() print("\nTEST #2 - Getting the Table Schema") print("schema: ", schema) schema_str = ct.sql_table_create_string() print("table create sql: ", schema_str) # perform a basic lookup with 'key' and 'value' f = ct.lookup("custom_doc_num", 2) print("\nTEST #3 - Basic Lookup - 'custom_doc_num' = 2") print("lookup: ", f) # if you prefer SQL, pass a SQL query directly (note: this will only work on Postgres and SQLite) if db_name == "sqlite": # note: our standard 'unpacking' of a row of sqlite includes the rowid attribute custom_query = f"SELECT rowid, * FROM {table_name} WHERE custom_doc_num = 3;" elif db_name == "postgres": custom_query = f"SELECT * FROM {table_name} WHERE custom_doc_num = 3;" elif db_name == "mongo": custom_query = {"custom_doc_num": 3} else: print("must use either sqlite, postgres or mongo") return -1 cf = ct.custom_lookup(custom_query) print("\nTEST #4 - Custom SQL Lookup - 'custom_doc_num' = 3") print("custom query lookup: ", cf) print("\nTEST #5 - Making Updates and Deletes") # to delete a record ct.delete_record("custom_doc_num", 1) print("deleted record") # to update the values of a record ct.update_record({"custom_doc_num": 2}, "file_name", "work_presentation_update_v2.pptx") print("updated record") updated_all_results = ct.get_all() for i, res 
in enumerate(updated_all_results): print("updated results: ", i, res) print("\nTEST #6 - Delete Table - uncomment and set confirm=True") # done? delete the table and start over # -- note: confirm=True must be set # ct.delete_table(confirm=False) # look at all tables in the database tables = ct.list_all_tables() print("\nTEST #7 - View all of the tables on the DB") for i, t in enumerate(tables): print("tables:" ,i, t) return 0 if __name__ == "__main__": hello_world_custom_table() ``` These examples illustrate the use of the CustomTable class to quickly create SQL tables that can be used in conjunction with LLM-based workflows. 1. [**Intro to CustomTables**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/create_custom_table-1.py) - Getting started with using CustomTables 2. [**Loading CSV into CustomTables**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/loading_csv_into_custom_table-2a.py) - Loading CSV into CustomTables 3. [**Loading CSV into Library (Configured)**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/loading_csv_w_config_options-2b.py) - Loading CSV into Library 4. [**Loading JSON into CustomTables**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Stuctured_Tables/loading_json_custom_table-3a.py) - Loading JSON into CustomTable database 5 [**Loading JSON into Library (Configured)**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Stuctured_Tables/loading_json_w_config_options-3b.py) - Loading JSON into a library with configuration For more examples, see the [structured tables example]((https://www.github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
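The numbered list above links to a full CSV-loading example in the repo. As a rough orientation, the sketch below is a simplified, hedged version that reuses only the ``CustomTable`` methods already shown above (``write_new_record``, ``get_all``) together with Python's standard ``csv`` module - the file name and column names are illustrative assumptions.

```python
import csv
from llmware.resources import CustomTable

# same schema pattern as the 'hello world' example above
files_schema = {"custom_doc_num": "integer", "file_name": "text", "comments": "text"}

ct = CustomTable(db="sqlite", table_name="files_table_from_csv", schema=files_schema)

# 'files.csv' is an illustrative file with columns matching the schema keys
with open("files.csv", newline="") as f:
    reader = csv.DictReader(f)
    for row in reader:
        record = {"custom_doc_num": int(row["custom_doc_num"]),
                  "file_name": row["file_name"],
                  "comments": row["comments"]}
        ct.write_new_record(record)

# confirm the rows were inserted
for i, res in enumerate(ct.get_all()):
    print("csv-loaded row: ", i, res)
```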
---
--- --- --- layout: default title: UI parent: Examples nav_order: 8 description: overview of the major modules and classes of LLMWare permalink: /examples/ui --- # UI - Introduction by Examples We introduce ``llmware`` through self-contained examples. **UI Scenarios** We provide several 'UI' examples that show how to use LLMWare in a complex recipe combining different elements to accomplish a specific objective. While each example is still high-level, it is shared in the spirit of providing a framework 'starting point' that can be developed in more detail for a variety of common use cases. All of these examples use small, specialized models, running locally - 'Small, but Mighty'! 1. [**GGUF Streaming Chatbot**](https://www.github.com/llmware-ai/llmware/tree/main/examples/UI/gguf_streaming_chatbot.py) - Locally deployed chatbot using leading open source chat models, including Phi-3-GGUF - Uses Streamlit - Core simple framework of ~20 lines using llmware and Streamlit 2. [**Simple RAG UI with Streamlit**](https://www.github.com/llmware-ai/llmware/tree/main/examples/UI/simple_rag_ui_with_streamlit.py) - Simple RAG UI 3. [**RAG UI with Query Topic with Streamlit**](https://www.github.com/llmware-ai/llmware/tree/main/examples/UI/rag_ui_with_query_topic_with_streamlit.py) - UI demonstrating query-by-topic in a RAG scenario 4. [**Using Streamlit Chat UI**](https://www.github.com/llmware-ai/llmware/tree/main/examples/UI/using_streamlit_chat_ui.py) - Basic Streamlit Chat UI A minimal, non-streaming sketch of this pattern is included at the end of this page. For more examples, see the [UI examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/UI/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you would like to make publicly, for example on GitHub by raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also send an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
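For orientation, here is a minimal, non-streaming sketch of the pattern the UI examples above build on: a Streamlit front end that passes a question and a pasted context passage to a small local model loaded through the ``Prompt`` class. This is a hedged starting point only - the linked examples use streaming and richer chat flows - and the model name and file name are assumptions.

```python
# save as simple_rag_sketch.py and run with:  streamlit run simple_rag_sketch.py
import streamlit as st
from llmware.prompts import Prompt

st.title("Minimal llmware RAG sketch")

@st.cache_resource
def load_prompter():
    # load a small local RAG-tuned model (model name is an assumption - any catalog model should work)
    return Prompt().load_model("bling-phi-3-gguf")

prompter = load_prompter()

question = st.text_input("Question:")
context = st.text_area("Paste a source passage to use as context:")

if question and context:
    output = prompter.prompt_main(question, context=context, prompt_name="default_with_context")
    st.write(output["llm_response"])
```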
---
--- --- --- layout: default title: Use Cases parent: Examples nav_order: 1 description: overview of the major modules and classes of LLMWare permalink: /examples/use_cases --- 🚀 Use Cases Examples 🚀 --- **End-to-End Scenarios** We provide several 'end-to-end' examples that show how to use LLMWare in a complex recipe combining different elements to accomplish a specific objective. While each example is still high-level, it is shared in the spirit of providing a high-level framework 'starting point' that can be developed in more detail for a variety of common use cases. All of these examples use small, specialized models, running locally - 'Small, but Mighty' ! 1. [**Research Automation with Agents and Web Services**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/web_services_slim_fx.py) - Prepare a 30-key research analysis on a company - Extract key lookup and other information from an earnings press release - Automatically use the lookup data for real-time stock information from YFinance - Automatically use the lookup date for background company history information in Wikipedia - Run LLM prompts to ask key questions of the Wikipedia sources - Aggregate into a consolidated research analysis - All with local open source models 2. [**Invoice Processing**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/invoice_processing.py) - Parse a batch of invoices (provided as sample files) - Extract key information from the invoices - Save the prompt state for follow-up review and analysis 3. [**Analyzing and Extracting Voice Transcripts**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/parsing_great_speeches.py) - Voice transcription of 50+ wav files of great speeches of the 20th century - Run text queries against the transcribed wav files - Execute LLM agent inferences to extract and identify key elements of interest - Prepare 'bibliography' with the key extracted points, including time-stamp 4. [**MSA Processing**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/msa_processing.py) - Identify the termination provisions in Master Service Agreements among a larger batch of contracts - Parse and query a large batch of contracts and identify the agreements with "Master Service Agreement" on the first page - Find the termination provisions in each MSA - Prompt LLM to read the termination provisions and answer a key question - Run a fact-check and source-check on the LLM response - Save all of the responses in CSV and JSON for follow-up review. 5. [**Querying a CSV**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/agent_with_custom_tables.py) - Start running natural language queries on CSVs with Postgres and slim-sql-tool. - Load a sample 'customer_table.csv' into Postgres - Start running natural language queries that get converted into SQL and query the DB 6. [**Contract Analysis**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/contract_analysis_on_laptop_with_bling_models.py) - Extract key information from set of employment agreement - Use a simple retrieval strategy with keyword search to identify key provisions and topic areas - Prompt LLM to read the key provisions and answer questions based on those source materials 7. 
[**Slicing and Dicing Office Docs**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/slicing_and_dicing_office_docs.py) - Shows a variety of advanced parsing techniques with Office document formats packaged in ZIP archives - Extracts tables and images, runs OCR against the embedded images, exports the whole library, and creates a dataset For more examples, see the [use cases examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/) in the main repo; a short sketch of the invoice-processing pattern also appears at the end of this page. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you would like to make publicly, for example on GitHub by raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also send an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
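The use cases above are full scripts in the repo. As a rough orientation, the invoice-processing pattern follows the same prompt-with-sources recipe used elsewhere in these docs; the hedged sketch below assumes an "Invoices" folder in the sample files and an illustrative query - see the linked example for the complete workflow.

```python
import os
from llmware.prompts import Prompt
from llmware.setup import Setup

# load the sample files and point at the invoices folder (folder name is an assumption)
sample_files_path = Setup().load_sample_files()
invoices_path = os.path.join(sample_files_path, "Invoices")

prompter = Prompt().load_model("bling-phi-3-gguf")

for invoice in os.listdir(invoices_path):

    # skip Mac file artifacts
    if invoice != ".DS_Store":

        # parse, text-chunk and filter the invoice, and package it into the prompt
        prompter.add_source_document(invoices_path, invoice, query="total amount")

        # ask the question against the packaged source
        responses = prompter.prompt_with_source("What is the total amount of the invoice?",
                                                prompt_name="default_with_context")

        for response in responses:
            print(invoice, " - ", response["llm_response"])

        prompter.clear_source_materials()

# save the prompt state for follow-up review
prompter.save_state()
```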
--- --- --- --- layout: default title: Clone Repo parent: Getting Started nav_order: 3 permalink: /getting_started/clone_repo --- ## ✍️ Working with the llmware Github repository The llmware repo can be pulled locally to get access to all the examples, or to work directly with the latest version of the llmware code. ```bash git clone git@github.com:llmware-ai/llmware.git ``` We have provided a **welcome_to_llmware** automation script in the root of the repository folder. After cloning: - On Windows command line: `.\welcome_to_llmware_windows.sh` - On Mac / Linux command line: `sh ./welcome_to_llmware.sh` Alternatively, if you prefer to complete setup without the welcome automation script, then the next steps include: 1. **install requirements.txt** - inside the /llmware path - e.g., ```pip3 install -r llmware/requirements.txt``` 2. **install requirements_extras.txt** - inside the /llmware path - e.g., ```pip3 install -r llmware/requirements_extras.txt``` (Depending upon your use case, you may not need all or any of these installs, but some of these will be used in the examples.) 3. **run examples** - copy one or more of the example .py files into the root project path. (We have seen several IDEs that will attempt to run interactively from the nested /example path, and then not have access to the /llmware module - the easy fix is to just copy the example you want to run into the root path). 4. **install vector db** - no-install vector db options include milvus lite, chromadb, faiss and lancedb - which do not require a server install, but do require that you install the python sdk library for that vector db, e.g., `pip3 install pymilvus`, or `pip3 install chromadb`. If you look in [examples/Embedding](https://github.com/llmware-ai/llmware/tree/main/examples/Embedding), you will see examples for getting started with various vector DB, and in the root of the repo, you will see easy-to-get-started docker compose scripts for installing milvus, postgres/pgvector, mongo, qdrant, neo4j, and redis. 5. Note: we have seen recently issues with Pytorch==2.3 on some platforms - if you run into any issues, we have seen that uninstalling Pytorch and downleveling to Pytorch==2.1 usually solves the problem. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. 
[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
---
--- --- --- layout: default title: Fast Start parent: Getting Started nav_order: 4 permalink: /getting_started/fast_start --- Fast Start: Learning RAG with llmware through 6 examples --- **Welcome to llmware!** Fast Start is a structured series of 6 self-contained examples and accompanying videos that walk through the core foundational components of RAG with LLMWare. Set up `pip3 install llmware` or, if you prefer clone the github repo locally, e.g., `git clone git@github.com:llmware-ai/llmware.git `. Platforms: - Mac M1/M2/M3, Windows, Linux (Ubuntu 20 or Ubuntu 22 preferred) - RAM: 16 GB minimum - Python 3.9, 3.10, 3.11, 3.12 - Pull the latest version of llmware == 0.2.14 (as of mid-May 2024) - Please note that we have updated the examples from the original versions, to use new features in llmware, so there may be minor differences with the videos, which are annotated in the comments in each example. There are 6 examples, designed to be used step-by-step, but each is self-contained, so you can feel free to jump into any of the examples, in any order, that you prefer. Each example has been designed to be "copy-paste" and RUN with lots of helpful comments and explanations embedded in the code samples. Please check out our [Fast Start Youtube tutorials](https://www.youtube.com/playlist?list=PL1-dn33KwsmD7SB9iSO6vx4ZLRAWea1DB) that walk through each example below. Examples: **Section I - Learning the Main Components** 1. **Library** - parse, text chunk, and index to convert a "pile of files" into an AI-ready knowledge-base. [Video](https://youtu.be/2xDefZ4oBOM?si=8vRCvqj0-HG3zc4c) 2. **Embeddings** - apply an embedding model to the Library, store vectors, and start enabling natural language queries. [Video](https://youtu.be/xQEk6ohvfV0?si=B3X25ZsAZfW4AR_3) 3. **Prompts** & **Model Catalog** - start running inferences and building prompts. [Video](https://youtu.be/swiu4oBVfbA?si=0IVmLhiiYS3-pMIg) **Section II - Connecting Knowledge with Prompts - 3 scenarios** 4. **RAG with Text Query** - start integrating documents into prompts. [Video](https://youtu.be/6oALi67HP7U?si=pAbvio4ULXTIXKdL) 5. **RAG with Semantic Query** - use natural language queries on documents and integrate with prompts. [Video](https://youtu.be/XT4kIXA9H3Q?si=EBCAxVXBt5vgYY8s) 6. **RAG with more complex retrieval** - start integrating more complex retrieval patterns. [Video](https://youtu.be/G1Q6Ar8THbo?si=vIVAv35uXAcnaUJy) After completing these 6 examples, you should have a good foundation and set of recipes to start exploring the other 100+ examples in the /examples folder, and build more sophisticated LLM-based applications. **Models** - All of these examples are optimized for using local CPU-based models, primarily BLING and DRAGON. - If you want to substitute for any other model in the catalog, it is generally as easy as switching the model_name. If the model requires API keys, we show in the examples how to pass those keys as an environment variable. **Collection Databases** - Our parsers are optimized to index text chunks directly into a persistent data store. - For Fast Start, we will use "sqlite" which is an embedded database, requiring no install - For more scalable deployment, we would recommend either "mongo" or "postgres" - Install instructions for "mongo" and "postgres" are provided in docker-compose files in the repository **Vector Databases** - For Fast Start, we will use "chromadb" in persistent 'file' mode, requiring no install. 
- Note: if you are using Python < 3.12, then please feel free to substitute faiss (which was used in the videos). - Note: depending upon how and when you installed llmware, you may need to `pip install chromadb`. - For more scalable deployment, we would recommend installing one of 9 supported vector databases, including Milvus, PGVector (Postgres), Redis, Qdrant, Neo4j, Mongo-Atlas, Chroma, LanceDB, or Pinecone. - Install instructions are provided in "examples/Embedding" for each specific DB, as well as docker-compose scripts. - A short configuration sketch of the Fast Start defaults appears at the end of this page. **Local Private** - All of the processing will take place locally on your laptop. *This is an ongoing initiative to provide easy-to-get-started tutorials - we welcome and encourage feedback, as well as contributions with examples and other tips for helping others on their LLM application journeys!* **Let's get started!** # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you would like to make publicly, for example on GitHub by raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also send an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
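To make the Fast Start database choices above concrete, here is a minimal, hedged sketch of the configuration calls together with creating a first library; the library name, folder path and embedding model are illustrative assumptions.

```python
from llmware.configs import LLMWareConfig
from llmware.library import Library

# Fast Start defaults - embedded sqlite for the text collection, chromadb in file mode for vectors
LLMWareConfig().set_active_db("sqlite")
LLMWareConfig().set_vector_db("chromadb")

# create a library and point add_files at a folder of documents (path is illustrative)
lib = Library().create_new_library("fast_start_library")
lib.add_files("/path/to/my/files")

# install an embedding on the library to enable semantic queries
lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="chromadb", batch_size=100)
```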
---
--- --- --- layout: default title: Getting Started nav_order: 2 has_children: true description: getting started with llmware permalink: /getting_started --- ## Welcome to llmware
## 🧰🛠️🔩The Ultimate Toolkit for Enterprise RAG Pipelines with Small, Specialized Models From quickly building POCs to scalable LLM Apps for the enterprise, LLMWare is packed with all the tools you need. `llmware` is an integrated framework with over 50+ small, specialized, open source models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. This project provides a comprehensive set of tools that anyone can use - from a beginner to the most sophisticated AI developer - to rapidly build industrial-grade, knowledge-based enterprise LLM applications. Our specific focus is on making it easy to integrate open source small specialized models and connecting enterprise knowledge safely and securely. ## Getting Started 1. Install llmware - `pip3 install llmware` 2. Make sure that you are running on a [supported platform](https://github.com/llmware-ai/llmware/blob/main/docs/getting_started/platforms.md#platform-support). 3. Learn by example: -- [Fast Start examples](https://www.github.com/llmware-ai/llmware/tree/main/fast_start) - structured set of 6 examples (with no DB installations required) to learn the main concepts of RAG with LLMWare - each example has extensive comments, and a supporting video on Youtube to walk you through it. -- [Getting Started examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Getting_Started) - heavily-annotated examples that review many getting started elements - selecting a database, loading sample files, working with libraries, and how to use the Model Catalog. -- [Use Case examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases) - longer examples that integrate several components of LLMWare to provide a framework for a solution for common use case patterns. -- Dive into specific area of interest - [Parsing](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing) - [Models](https://www.github.com/llmware-ai/llmware/tree/main/examples/Models) - [Prompts](https://www.github.com/llmware-ai/llmware/tree/main/examples/Models) - [Agents](https://www.github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents) - and many more ... 4. We provide extensive [sample files](https://www.github.com/llmware-ai/tree/main/examples/Getting_Started/loading_sample_files.py) integrated into the examples, so you can copy-paste-run, and quickly validate that the installation is set up correctly, and to start seeing key classes and methods in action. We would encourage you to start with the 'out of the box' example first, and then use the example as the launching point for inserting your documents, models, queries, and workflows. 5. Learn by watching: check out the [LLMWare Youtube channel](https://www.youtube.com/@llmware). 6. Share with the community: join us on [Discord](https://discord.gg/MhZn5Nc39h). 
[Install llmware](#install-llmware){: .btn .btn-primary .fs-5 .mb-4 .mb-md-0 .mr-2 } [Common Setup & Configuration Items](#platform-support){: .btn .fs-5 .mb-4 .mb-md-0 } [Architecture](architecture.md/#llmware-architecture){: .btn .fs-5 .mb-4 .mb-md-0 } [View llmware on GitHub](https://www.github.com/llmware-ai/llmware/tree/main){: .btn .fs-5 .mb-4 .mb-md-0 } [Open an Issue on GitHub](https://www.github.com/llmware-ai/llmware/issues){: .btn .fs-5 .mb-4 .mb-md-0 } # Install llmware ___ **Using Pip Install** - Installing llmware is easy: `pip3 install llmware` - If you prefer, we also provide a set of recent wheels in the [wheel archives](https://www.github.com/llmware-ai/llmware/tree/main/wheel_archives) in this repository, which can be downloaded individually and used as follows: ```bash pip3 install llmware-0.2.12-py3-none-any.wheel ```` - We generally keep the main branch of this repository current with all changes, but we only publish new wheels to PyPi approximately once per week ___ ___ **Cloning the Repository** - If you prefer to clone the repository: ```bash git clone git@github.com:llmware-ai/llmware.git ``` - The llmware package is contained entirely in the /llmware folder path, so you should be able to drop this folder (with all of its contents) into a project tree, and use the llmware module essentially the same as a pip install. - Please ensure that you are capturing and updating the /llmware/lib folder, which includes required compiled shared libraries. If you prefer, you can keep only those libs required for your OS platform. - After cloning the repo, we provide a short 'welcome to llmware' automation script, which can be used to install the projects requirements (from llmware/requirements.txt), install several optional dependencies that are commonly used in examples, copy several good 'getting started' examples into the root folder, and then run a 'welcome_example.py' script to get started using our models. To use the "welcome to llmware" script: Windows: ```bash .\welcome_to_llmware_windows.sh ``` Mac/Linux: ```bash sh ./welcome_to_llmware.sh ``` # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. 
## License `llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
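After installing, a quick way to confirm that the setup is working is to load one small model from the ``ModelCatalog`` and run a single inference. This is a minimal, hedged smoke-test sketch with an illustrative model name and test passage.

```python
from llmware.models import ModelCatalog

# list everything available in the catalog
models = ModelCatalog().list_all_models()
print("number of models in catalog: ", len(models))

# load a small local model and run one test inference (model name is an assumption)
model = ModelCatalog().load_model("llmware/bling-tiny-llama-v0")
response = model.inference("What is the total amount of the invoice?",
                           add_context="Invoice #123 - Total Amount $22,500.00 - due in 30 days.")

# print the full response object returned by the model
print("response: ", response)
```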
---
--- --- --- layout: default title: Installation parent: Getting Started nav_order: 2 permalink: /getting_started/installation --- ## Installation Set up `pip3 install llmware` or, if you prefer clone the github repo locally, e.g., `git clone git@github.com:llmware-ai/llmware.git `. Platforms: - Mac M1/M2/M3, Windows, Linux (Ubuntu 20 or Ubuntu 22 preferred) - RAM: 16 GB minimum - Python 3.9, 3.10, 3.11 (note: not supported on 3.12 - coming soon!) - Pull the latest version of llmware == 0.2.11 (as of end of April 2024) - Please note that we have updated the examples from the original versions, to use new features in llmware, so there may be minor differences with the videos, which are annotated in the comments in each example. ## Wheel Archive - If you prefer, we also provide a set of recent wheels in the [wheel archives](https://www.github.com/llmware-ai/llmware/tree/main/wheel_archives) in this repository, which can be downloaded individually and used as follows: ```bash pip3 install llmware-0.2.12-py3-none-any.wheel ```` - We generally keep the main branch of this repository current with all changes, but we only publish new wheels to PyPi approximately once per week ___ ___ **Cloning the Repository** - If you prefer to clone the repository: ```bash git clone git@github.com:llmware-ai/llmware.git ``` - The llmware package is contained entirely in the /llmware folder path, so you should be able to drop this folder (with all of its contents) into a project tree, and use the llmware module essentially the same as a pip install. - Please ensure that you are capturing and updating the /llmware/lib folder, which includes required compiled shared libraries. If you prefer, you can keep only those libs required for your OS platform. - After cloning the repo, we provide a short 'welcome to llmware' automation script, which can be used to install the projects requirements (from llmware/requirements.txt), install several optional dependencies that are commonly used in examples, copy several good 'getting started' examples into the root folder, and then run a 'welcome_example.py' script to get started using our models. To use the "welcome to llmware" script: Windows: ```bash .\welcome_to_llmware_windows.sh ``` Mac/Linux: ```bash sh ./welcome_to_llmware.sh ``` # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. 
[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
---
--- --- --- layout: default title: Overview parent: Getting Started nav_order: 1 permalink: /getting_started/overview --- ## Welcome to llmware
## 🧰🛠️🔩 Building Enterprise RAG Pipelines with Small, Specialized Models `llmware` provides a unified framework for building LLM-based applications (e.g., RAG, Agents), using small, specialized models that can be deployed privately, integrated with enterprise knowledge sources safely and securely, and cost-effectively tuned and adapted for any business process. `llmware` has two main components: 1. **RAG Pipeline** - integrated components for the full lifecycle of connecting knowledge sources to generative AI models; and 2. **50+ small, specialized models** fine-tuned for key tasks in enterprise process automation, including fact-based question-answering, classification, summarization, and extraction. By bringing together both of these components, along with integrating leading open source models and underlying technologies, `llmware` offers a comprehensive set of tools to rapidly build knowledge-based enterprise LLM applications. Most of our examples can be run without a GPU server - get started right away on your laptop. ## 🎯 Key features Writing code with `llmware` is based on a few main concepts:
Model Catalog: Access all models the same way with easy lookup, regardless of underlying implementation. ```python # 150+ Models in Catalog with 50+ RAG-optimized BLING, DRAGON and Industry BERT models # Full support for GGUF, HuggingFace, Sentence Transformers and major API-based models # Easy to extend to add custom models - see examples from llmware.models import ModelCatalog from llmware.prompts import Prompt # all models accessed through the ModelCatalog models = ModelCatalog().list_all_models() # to use any model in the ModelCatalog - "load_model" method and pass the model_name parameter my_model = ModelCatalog().load_model("llmware/bling-phi-3-gguf") output = my_model.inference("what is the future of AI?", add_context="Here is the article to read") # to integrate model into a Prompt prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") response = prompter.prompt_main("what is the future of AI?", context="Insert Sources of information") ```
Library: ingest, organize and index a collection of knowledge at scale - Parse, Text Chunk and Embed. ```python from llmware.library import Library # to parse and text chunk a set of documents (pdf, pptx, docx, xlsx, txt, csv, md, json/jsonl, wav, png, jpg, html) # step 1 - create a library, which is the 'knowledge-base container' construct # - libraries have both text collection (DB) resources, and file resources (e.g., llmware_data/accounts/{library_name}) # - embeddings and queries are run against a library lib = Library().create_new_library("my_library") # step 2 - add_files is the universal ingestion function - point it at a local file folder with mixed file types # - files will be routed by file extension to the correct parser, parsed, text chunked and indexed in text collection DB lib.add_files("/folder/path/to/my/files") # to install an embedding on a library - pick an embedding model and vector_db lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="milvus", batch_size=500) # to add a second embedding to the same library (mix-and-match models + vector db) lib.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="chromadb", batch_size=100) # easy to create multiple libraries for different projects and groups finance_lib = Library().create_new_library("finance_q4_2023") finance_lib.add_files("/finance_folder/") hr_lib = Library().create_new_library("hr_policies") hr_lib.add_files("/hr_folder/") # pull library card with key metadata - documents, text chunks, images, tables, embedding record lib_card = Library().get_library_card("my_library") # see all libraries all_my_libs = Library().get_all_library_cards() ```
Query: query libraries with mix of text, semantic, hybrid, metadata, and custom filters. ```python from llmware.retrieval import Query from llmware.library import Library # step 1 - load the previously created library lib = Library().load_library("my_library") # step 2 - create a query object and pass the library q = Query(lib) # step 3 - run lots of different queries (many other options in the examples) # basic text query results1 = q.text_query("text query", result_count=20, exact_mode=False) # semantic query results2 = q.semantic_query("semantic query", result_count=10) # combining a text query restricted to only certain documents in the library and "exact" match to the query results3 = q.text_query_with_document_filter("new query", {"file_name": "selected file name"}, exact_mode=True) # to apply a specific embedding (if multiple on library), pass the names when creating the query object q2 = Query(lib, embedding_model_name="mini_lm_sbert", vector_db="milvus") results4 = q2.semantic_query("new semantic query") ```
Prompt with Sources: the easiest way to combine knowledge retrieval with a LLM inference. ```python from llmware.prompts import Prompt from llmware.retrieval import Query from llmware.library import Library # build a prompt prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") # add a file -> file is parsed, text chunked, filtered by query, and then packaged as model-ready context, # including in batches, if needed, to fit the model context window source = prompter.add_source_document("/folder/to/one/doc/", "filename", query="fast query") # attach query results (from a Query) into a Prompt my_lib = Library().load_library("my_library") results = Query(my_lib).query("my query") source2 = prompter.add_source_query_results(results) # run a new query against a library and load directly into a prompt source3 = prompter.add_source_new_query(my_lib, query="my new query", query_type="semantic", result_count=15) # to run inference with 'prompt with sources' responses = prompter.prompt_with_source("my query") # to run fact-checks - post inference fact_check = prompter.evidence_check_sources(responses) # to view source materials (batched 'model-ready' and attached to prompt) source_materials = prompter.review_sources_summary() # to see the full prompt history prompt_history = prompter.get_current_history() ```
RAG-Optimized Models - 1-7B parameter models designed for RAG workflow integration and running locally. ``` """ This 'Hello World' example demonstrates how to get started using local BLING models with provided context, using both Pytorch and GGUF versions. """ import time from llmware.prompts import Prompt def hello_world_questions(): test_list = [ {"query": "What is the total amount of the invoice?", "answer": "$22,500.00", "context": "Services Vendor Inc. \n100 Elm Street Pleasantville, NY \nTO Alpha Inc. 5900 1st Street " "Los Angeles, CA \nDescription Front End Engineering Service $5000.00 \n Back End Engineering" " Service $7500.00 \n Quality Assurance Manager $10,000.00 \n Total Amount $22,500.00 \n" "Make all checks payable to Services Vendor Inc. Payment is due within 30 days." "If you have any questions concerning this invoice, contact Bia Hermes. " "THANK YOU FOR YOUR BUSINESS! INVOICE INVOICE # 0001 DATE 01/01/2022 FOR Alpha Project P.O. # 1000"}, {"query": "What was the amount of the trade surplus?", "answer": "62.4 billion yen ($416.6 million)", "context": "Japan’s September trade balance swings into surplus, surprising expectations" "Japan recorded a trade surplus of 62.4 billion yen ($416.6 million) for September, " "beating expectations from economists polled by Reuters for a trade deficit of 42.5 " "billion yen. Data from Japan’s customs agency revealed that exports in September " "increased 4.3% year on year, while imports slid 16.3% compared to the same period " "last year. According to FactSet, exports to Asia fell for the ninth straight month, " "which reflected ongoing China weakness. Exports were supported by shipments to " "Western markets, FactSet added. — Lim Hui Jie"}, {"query": "When did the LISP machine market collapse?", "answer": "1987.", "context": "The attendees became the leaders of AI research in the 1960s." " They and their students produced programs that the press described as 'astonishing': " "computers were learning checkers strategies, solving word problems in algebra, " "proving logical theorems and speaking English. By the middle of the 1960s, research in " "the U.S. was heavily funded by the Department of Defense and laboratories had been " "established around the world. Herbert Simon predicted, 'machines will be capable, " "within twenty years, of doing any work a man can do'. Marvin Minsky agreed, writing, " "'within a generation ... the problem of creating 'artificial intelligence' will " "substantially be solved'. They had, however, underestimated the difficulty of the problem. " "Both the U.S. and British governments cut off exploratory research in response " "to the criticism of Sir James Lighthill and ongoing pressure from the US Congress " "to fund more productive projects. Minsky's and Papert's book Perceptrons was understood " "as proving that artificial neural networks approach would never be useful for solving " "real-world tasks, thus discrediting the approach altogether. The 'AI winter', a period " "when obtaining funding for AI projects was difficult, followed. In the early 1980s, " "AI research was revived by the commercial success of expert systems, a form of AI " "program that simulated the knowledge and analytical skills of human experts. By 1985, " "the market for AI had reached over a billion dollars. At the same time, Japan's fifth " "generation computer project inspired the U.S. and British governments to restore funding " "for academic research. 
However, beginning with the collapse of the Lisp Machine market " "in 1987, AI once again fell into disrepute, and a second, longer-lasting winter began."}, {"query": "What is the current rate on 10-year treasuries?", "answer": "4.58%", "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, " "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy " "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " "jobs. However, wages rose less than expected last month. Stocks posted a stunning " "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " "At its session low, the Dow had fallen as much as 198 points; it surged by more than " "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " "their lowest points in the day. Traders were unclear of the reason for the intraday " "reversal. Some noted it could be the softer wage number in the jobs report that made " "investors rethink their earlier bearish stance. Others noted the pullback in yields from " "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " "near its highest level in 14 years. The benchmark rate later eased from those levels, but " "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s " "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries " "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially " "some oversold conditions.'"}, {"query": "Is the expected gross margin greater than 70%?", "answer": "Yes, between 71.5% and 72.%", "context": "Outlook NVIDIA’s outlook for the third quarter of fiscal 2024 is as follows:" "Revenue is expected to be $16.00 billion, plus or minus 2%. GAAP and non-GAAP " "gross margins are expected to be 71.5% and 72.5%, respectively, plus or minus " "50 basis points. GAAP and non-GAAP operating expenses are expected to be " "approximately $2.95 billion and $2.00 billion, respectively. GAAP and non-GAAP " "other income and expense are expected to be an income of approximately $100 " "million, excluding gains and losses from non-affiliated investments. GAAP and " "non-GAAP tax rates are expected to be 14.5%, plus or minus 1%, excluding any discrete items." "Highlights NVIDIA achieved progress since its previous earnings announcement " "in these areas: Data Center Second-quarter revenue was a record $10.32 billion, " "up 141% from the previous quarter and up 171% from a year ago. Announced that the " "NVIDIA® GH200 Grace™ Hopper™ Superchip for complex AI and HPC workloads is shipping " "this quarter, with a second-generation version with HBM3e memory expected to ship " "in Q2 of calendar 2024. "}, {"query": "What is Bank of America's rating on Target?", "answer": "Buy", "context": "Here are some of the tickers on my radar for Thursday, Oct. 
12, taken directly from " "my reporter’s notebook: It’s the one-year anniversary of the S&P 500′s bear market bottom " "of 3,577. Since then, as of Wednesday’s close of 4,376, the broad market index " "soared more than 22%. Hotter than expected September consumer price index, consumer " "inflation. The Social Security Administration issues announced a 3.2% cost-of-living " "adjustment for 2024. Chipotle Mexican Grill (CMG) plans price increases. Pricing power. " "Cites consumer price index showing sticky retail inflation for the fourth time " "in two years. Bank of America upgrades Target (TGT) to buy from neutral. Cites " "risk/reward from depressed levels. Traffic could improve. Gross margin upside. " "Merchandising better. Freight and transportation better. Target to report quarter " "next month. In retail, the CNBC Investing Club portfolio owns TJX Companies (TJX), " "the off-price juggernaut behind T.J. Maxx, Marshalls and HomeGoods. Goldman Sachs " "tactical buy trades on Club names Wells Fargo (WFC), which reports quarter Friday, " "Humana (HUM) and Nvidia (NVDA). BofA initiates Snowflake (SNOW) with a buy rating." "If you like this story, sign up for Jim Cramer’s Top 10 Morning Thoughts on the " "Market email newsletter for free. Barclays cuts price targets on consumer products: " "UTZ Brands (UTZ) to $16 per share from $17. Kraft Heinz (KHC) to $36 per share from " "$38. Cyclical drag. J.M. Smucker (SJM) to $129 from $160. Secular headwinds. " "Coca-Cola (KO) to $59 from $70. Barclays cut PTs on housing-related stocks: Toll Brothers" "(TOL) to $74 per share from $82. Keeps underweight. Lowers Trex (TREX) and Azek" "(AZEK), too. Goldman Sachs (GS) announces sale of fintech platform and warns on " "third quarter of 19-cent per share drag on earnings. The buyer: investors led by " "private equity firm Sixth Street. Exiting a mistake. Rise in consumer engagement for " "Spotify (SPOT), says Morgan Stanley. The analysts hike price target to $190 per share " "from $185. Keeps overweight (buy) rating. JPMorgan loves elf Beauty (ELF). Keeps " "overweight (buy) rating but lowers price target to $139 per share from $150. " "Sees “still challenging” environment into third-quarter print. The Club owns shares " "in high-end beauty company Estee Lauder (EL). Barclays upgrades First Solar (FSLR) " "to overweight from equal weight (buy from hold) but lowers price target to $224 per " "share from $230. Risk reward upgrade. Best visibility of utility scale names."}, {"query": "What was the rate of decline in 3rd quarter sales?", "answer": "20% year-on-year.", "context": "Nokia said it would cut up to 14,000 jobs as part of a cost cutting plan following " "third quarter earnings that plunged. The Finnish telecommunications giant said that " "it will reduce its cost base and increase operation efficiency to “address the " "challenging market environment. The substantial layoffs come after Nokia reported " "third-quarter net sales declined 20% year-on-year to 4.98 billion euros. Profit over " "the period plunged by 69% year-on-year to 133 million euros."}, {"query": "What is a list of the key points?", "answer": "•Stocks rallied on Friday with stronger-than-expected U.S jobs data and increase in " "Treasury yields;\n•Dow Jones gained 195.12 points;\n•S&P 500 added 1.59%;\n•Nasdaq Composite rose " "1.35%;\n•U.S. 
economy added 438,000 jobs in August, better than the 273,000 expected;\n" "•10-year Treasury rate trading near the highest level in 14 years at 4.58%.", "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, " "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy " "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " "jobs. However, wages rose less than expected last month. Stocks posted a stunning " "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " "At its session low, the Dow had fallen as much as 198 points; it surged by more than " "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " "their lowest points in the day. Traders were unclear of the reason for the intraday " "reversal. Some noted it could be the softer wage number in the jobs report that made " "investors rethink their earlier bearish stance. Others noted the pullback in yields from " "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " "near its highest level in 14 years. The benchmark rate later eased from those levels, but " "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s " "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries " "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially " "some oversold conditions.'"} ] return test_list # this is the main script to be run def bling_meets_llmware_hello_world (model_name): t0 = time.time() # load the questions test_list = hello_world_questions() print(f"\n > Loading Model: {model_name}...") # load the model prompter = Prompt().load_model(model_name) t1 = time.time() print(f"\n > Model {model_name} load time: {t1-t0} seconds") for i, entries in enumerate(test_list): print(f"\n{i+1}. 
Query: {entries['query']}") # run the prompt output = prompter.prompt_main(entries["query"], context=entries["context"], prompt_name="default_with_context", temperature=0.30) # print out the results llm_response = output["llm_response"].strip("\n") print(f"LLM Response: {llm_response}") print(f"Gold Answer: {entries['answer']}") print(f"LLM Usage: {output['usage']}") t2 = time.time() print(f"\nTotal processing time: {t2-t1} seconds") return 0 if __name__ == "__main__": # list of 'rag-instruct' laptop-ready small bling models on HuggingFace pytorch_models = ["llmware/bling-1b-0.1", # most popular "llmware/bling-tiny-llama-v0", # fastest "llmware/bling-1.4b-0.1", "llmware/bling-falcon-1b-0.1", "llmware/bling-cerebras-1.3b-0.1", "llmware/bling-sheared-llama-1.3b-0.1", "llmware/bling-sheared-llama-2.7b-0.1", "llmware/bling-red-pajamas-3b-0.1", "llmware/bling-stable-lm-3b-4e1t-v0", "llmware/bling-phi-3" # most accurate (and newest) ] # Quantized GGUF versions generally load faster and run nicely on a laptop with at least 16 GB of RAM gguf_models = ["bling-phi-3-gguf", "bling-stablelm-3b-tool", "dragon-llama-answer-tool", "dragon-yi-answer-tool", "dragon-mistral-answer-tool"] # try a model from either the pytorch or gguf model list # the newest (and most accurate) is 'bling-phi-3-gguf' bling_meets_llmware_hello_world(gguf_models[0]) # check out the model card on Huggingface for RAG benchmark test performance results and other useful information ```
Simple-to-Scale Database Options - integrated data stores from laptop to parallelized cluster. ```python from llmware.configs import LLMWareConfig # to set the collection database - mongo, sqlite, postgres LLMWareConfig().set_active_db("mongo") # to set the vector database (or declare when installing) # --options: milvus, pg_vector (postgres), redis, qdrant, faiss, pinecone, mongo atlas LLMWareConfig().set_vector_db("milvus") # for fast start - no installations required LLMWareConfig().set_active_db("sqlite") LLMWareConfig().set_vector_db("chromadb") # try also faiss and lancedb # for single postgres deployment LLMWareConfig().set_active_db("postgres") LLMWareConfig().set_vector_db("postgres") # to install mongo, milvus, postgres - see the docker-compose scripts as well as examples ```
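To see how these settings flow into an actual pipeline, here is a minimal sketch - the library name and embedding model are illustrative, and it assumes the fast-start sqlite + chromadb combination above along with the llmware sample files:
```python
import os

from llmware.configs import LLMWareConfig
from llmware.library import Library
from llmware.retrieval import Query
from llmware.setup import Setup

# fast start - no separate database installs assumed (sqlite + chromadb)
LLMWareConfig().set_active_db("sqlite")
LLMWareConfig().set_vector_db("chromadb")

# parse and text-chunk sample documents into a new library (text collection)
sample_files_path = Setup().load_sample_files()
lib = Library().create_new_library("fast_start_library")   # illustrative library name
lib.add_files(input_folder_path=os.path.join(sample_files_path, "Agreements"))

# build embeddings in the configured vector database
lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="chromadb")

# run a quick semantic query to confirm the text and vector stores are wired together
results = Query(lib).semantic_query("base salary", result_count=3)
for res in results:
    # result keys shown here ('file_source', 'text') follow the usual llmware query output
    print(res["file_source"], "-", res["text"][:75])
```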
🔥 Agents with Function Calls and SLIM Models 🔥 ```python from llmware.agents import LLMfx text = ("Tesla stock fell 8% in premarket trading after reporting fourth-quarter revenue and profit that " "missed analysts’ estimates. The electric vehicle company also warned that vehicle volume growth in " "2024 'may be notably lower' than last year’s growth rate. Automotive revenue, meanwhile, increased " "just 1% from a year earlier, partly because the EVs were selling for less than they had in the past. " "Tesla implemented steep price cuts in the second half of the year around the world. In a Wednesday " "presentation, the company warned investors that it’s 'currently between two major growth waves.'") # create an agent using LLMfx class agent = LLMfx() # load text to process agent.load_work(text) # load 'models' as 'tools' to be used in analysis process agent.load_tool("sentiment") agent.load_tool("extract") agent.load_tool("topics") agent.load_tool("boolean") # run function calls using different tools agent.sentiment() agent.topics() agent.extract(params=["company"]) agent.extract(params=["automotive revenue growth"]) agent.xsum() agent.boolean(params=["is 2024 growth expected to be strong? (explain)"]) # at end of processing, show the report that was automatically aggregated by key report = agent.show_report() # displays a summary of the activity in the process activity_summary = agent.activity_summary() # list of the responses gathered for i, entries in enumerate(agent.response_list): print("update: response analysis: ", i, entries) output = {"report": report, "activity_summary": activity_summary, "journal": agent.journal} ```
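Outside of the agent loop, an individual SLIM tool can also be called directly from the model catalog - a minimal sketch, assuming the `slim-sentiment-tool` model is available in the ModelCatalog:
```python
from llmware.models import ModelCatalog

# load a SLIM classifier tool from the model catalog (downloaded and cached on first use)
model = ModelCatalog().load_model("slim-sentiment-tool")

# run a structured function call over a short text sample
response = model.function_call("Tesla stock fell 8% in premarket trading after weak guidance.")

# the llm_response is a python dictionary, e.g. {'sentiment': ['negative']}
print("sentiment function call output: ", response["llm_response"])
```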
🚀 Start coding - Quick Start for RAG 🚀 ```python # This example illustrates a simple contract analysis # using a RAG-optimized LLM running locally import os import re from llmware.prompts import Prompt, HumanInTheLoop from llmware.setup import Setup from llmware.configs import LLMWareConfig def contract_analysis_on_laptop (model_name): # In this scenario, we will: # -- download a set of sample contract files # -- create a Prompt and load a BLING LLM model # -- parse each contract, extract the relevant passages, and pass questions to a local LLM # Main loop - Iterate thru each contract: # # 1. parse the document in memory (convert from PDF file into text chunks with metadata) # 2. filter the parsed text chunks with a "topic" (e.g., "governing law") to extract relevant passages # 3. package and assemble the text chunks into a model-ready context # 4. ask three key questions for each contract to the LLM # 5. print to the screen # 6. save the results in both json and csv for further processing and review. # Load the llmware sample files print (f"\n > Loading the llmware sample files...") sample_files_path = Setup().load_sample_files() contracts_path = os.path.join(sample_files_path,"Agreements") # Query list - these are the 3 main topics and questions that we would like the LLM to analyze for each contract query_list = {"executive employment agreement": "What are the names of the two parties?", "base salary": "What is the executive's base salary?", "vacation": "How many vacation days will the executive receive?"} # Load the selected model by name that was passed into the function print (f"\n > Loading model {model_name}...") prompter = Prompt().load_model(model_name, temperature=0.0, sample=False) # Main loop for i, contract in enumerate(os.listdir(contracts_path)): # excluding Mac file artifact (annoying, but fact of life in demos) if contract != ".DS_Store": print("\nAnalyzing contract: ", str(i+1), contract) print("LLM Responses:") for key, value in query_list.items(): # step 1 + 2 + 3 above - contract is parsed, text-chunked, filtered by topic key, # ... and then packaged into the prompt source = prompter.add_source_document(contracts_path, contract, query=key) # step 4 above - calling the LLM with 'source' information already packaged into the prompt responses = prompter.prompt_with_source(value, prompt_name="default_with_context") # step 5 above - print out to screen for r, response in enumerate(responses): print(key, ":", re.sub("[\n]"," ", response["llm_response"]).strip()) # We're done with this contract, clear the source from the prompt prompter.clear_source_materials() # step 6 above - saving the analysis to jsonl and csv # Save jsonl report to /prompt_history folder print("\nPrompt state saved at: ", os.path.join(LLMWareConfig.get_prompt_path(),prompter.prompt_id)) prompter.save_state() # Save csv report that includes the model, response, prompt, and evidence for human-in-the-loop review csv_output = HumanInTheLoop(prompter).export_current_interaction_to_csv() print("csv output saved at: ", csv_output) if __name__ == "__main__": # use local cpu model - try the newest - RAG finetune of Phi-3 quantized and packaged in GGUF model = "bling-phi-3-gguf" contract_analysis_on_laptop(model) ```
# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
---
--- --- --- layout: default title: Platforms Supported parent: Getting Started nav_order: 5 permalink: /getting_started/platforms --- ___ # Platform Support ___ **Platform Supported** - **Python 3.9+** (note that we just added support for 3.12 starting in llmware version 0.2.12) - **System RAM**: recommended 16 GB RAM minimum (to run most local models on CPU) - **OS Supported**: Mac OS M1/M2/M3, Windows, Linux Ubuntu 20/22. We regularly build and test on Windows and Linux platforms with and without CUDA drivers. - **Deprecated OS**: Linux Aarch64 (0.2.6) and Mac x86 (0.2.10) - most features of llmware should work on these platforms, but new features integrated since those versions will not be available. If you have a particular need to work on one of these platforms, please raise an Issue, and we can work with you to try to find a solution. - **Linux**: we build to GLIBC 2.31+ - so Linux versions with older GLIBC drivers will generally not work (e.g., Ubuntu 18). To check the GLIBC version, you can use the command `ldd --version`. If it is 2.31 or any higher version, it should work. ___ ___ **Database** - LLMWare is an enterprise-grade data pipeline designed for persistent storage of key artifacts throughout the pipeline. We provide several options to parse 'in-memory' and write to jsonl files, but most of the functionality of LLMWare assumes that a persistent scalable data store will be used. - There are three different types of data storage used in LLMWare: 1. **Text Collection database** - all of the LLMWare parsers, by default, parse and text chunk unstructured content (and associated metadata) into one of three databases used for text collections, organized in Libraries - **MongoDB**, **Postgres** and **SQLite**. 2. **Vector database** - for storing and retrieving semantic embedding vectors, LLMWare supports the following vector databases - Milvus, PG Vector / Postgres, Qdrant, ChromaDB, Redis, Neo4J, Lance DB, Mongo-Atlas, Pinecone and FAISS. 3. **SQL Tables database** - for easily integrating table-based data into LLM workflows through the CustomTable class and for using in conjunction with a Text-2-SQL workflow - supported on Postgres and SQLite. - **Fast Start** option: you can start using SQLite locally without any separate installation by setting `LLMWareConfig.set_active_db("sqlite")` as shown in [configure_db_example](https://www.github.com/llmware-ai/llmware/blob/main/examples/Getting_Started/configure_db.py). For vector embedding examples, you can use ChromaDB, LanceDB or FAISS - all of which provide no-install options - just start using. - **Install DB dependencies**: we provide a number of Docker-Compose scripts which can be used, or follow install instructions provided by the database - generally easiest to install locally with Docker. 
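As a quick illustration of the no-install 'Fast Start' path described above, both settings can be made in a couple of lines before any libraries are created - a short sketch (the getter calls are shown only to confirm the configuration):
```python
from llmware.configs import LLMWareConfig

# no-install fast start: SQLite for text collections, ChromaDB for vector embeddings
LLMWareConfig().set_active_db("sqlite")
LLMWareConfig().set_vector_db("chromadb")

# confirm the active settings
print("text collection db: ", LLMWareConfig().get_active_db())
print("vector db: ", LLMWareConfig().get_vector_db())
```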
**LLMWare File Storage** - llmware stores a variety of artifacts during its operation locally in the /llmware_data path, which can be found as follows: ```python from llmware.configs import LLMWareConfig llmware_fp = LLMWareConfig().get_llmware_path() print("llmware_data path: ", llmware_fp) ``` - to change the llmware path, we can change both the 'home' path, which is the main filepath, and the 'llmware_data' path name as follows: ```python from llmware.configs import LLMWareConfig # changing the llmware home path - change home + llmware_path_name LLMWareConfig().set_home("/my/new/local/home/path") LLMWareConfig().set_llmware_path_name("llmware_data2") # check the new llmware home path llmware_fp = LLMWareConfig().get_llmware_path() print("updated llmware path: ", llmware_fp) ``` ___ ___ **Local Models** - LLMWare treats open source and locally deployed models as "first class citizens" with all classes, methods and examples designed to work first with smaller, specialized, locally-deployed models. - By default, most models are pulled from public HuggingFace repositories, and cached locally. LLMWare will store all models locally at the /llmware_data/model_repo path, with all assets found in a folder tree with the models name. - If a Pytorch model is pulled from HuggingFace, then it will appear in the default HuggingFace /.cache path. - To view the local model path: ```python from llmware.configs import LLMWareConfig model_fp = LLMWareConfig().get_model_repo_path() print("model repo path: ", model_fp) ``` # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
---
--- --- --- layout: default title: Working with Docker parent: Getting Started nav_order: 6 permalink: /getting_started/working_with_docker --- # Working with Docker Scripts This section is a short guide on setting up a Linux environment with Docker and running LLMWare examples with different database systems. ## 1. Python and Pip Python should come installed with your Linux environment. To install Pip, run the following: ``` sudo apt-get update sudo apt-get -y install python3-pip pip3 install --upgrade pip ``` ## 2. Docker and Docker Compose The latest versions of Docker and Docker Compose should be installed in order to use the Docker Compose files in the LLMWare repository. Instructions to install Docker: https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-20-04 (Steps 1-2) Note: Step 1 is necessary, Step 2 is optional but we highly recommend it. Instructions to install Docker Compose: https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-compose-on-ubuntu-20-04 (Step 1) Note: replace the URL in the `curl` command with the latest download from https://github.com/docker/compose/releases. Check that Docker is running on your system: ``` sudo systemctl status docker ``` ## 3. Running Docker Compose files `cd` into the repository and ensure that you can see files of the format `docker-compose-database-name.yaml`. To run a Compose file: ``` docker-compose -f docker-compose-database-name.yaml up -d ``` Check that the container is running: ``` docker ps ``` Note: this will list only the containers that are currently running, add the `-a` flag (`docker ps -a`) to list all containers (even those that are stopped). ## 4. Test with Examples The Compose files currently support 6 database systems: - Mongo - Postgres/PG Vector - Neo4j - Milvus - Qdrant - Redis Note: PG Vector is an alias for Postgres and is used for vector embeddings. 1. Mongo and Postgres are used as the active database to store library text collections. 2. PG Vector, Neo4j, Milvus, Qdrant and Redis are used as the vector database to store vector embeddings. To test that the containers are working as intended, you can modify an example provided in the LLMWare repository. The simplest example to modify is `fast_start/example-2-build_embeddings.py`. Open the file in an editor. 1. Change the argument passed in as the active database on line 128 to an appropriate active database (Mongo or Postgres). 2. Change the argument passed in as the vector database on line 138 to an appropriate vector database (PG Vector, Neo4j, Milvus, Qdrant or Redis). Run the example with these changes, and you should see updates in the terminal indicating that the embeddings are being generated correctly. Note: It is possible that you will see an error: ``` llmware.exceptions.EmbeddingModelNotFoundException: Embedding model for 'example2_library' could not be located ``` In this case, use a unique name for the library name passed in on line 147. ## 5. Stopping/Deleting Containers To stop a container, run: ``` docker stop container_ID_OR_container_name ``` To delete a container, run: ``` docker rm container_ID_OR_container_name ``` Note: passing in either the ID or the name will work. To find the ID or name of a container, run: ``` docker ps -a ``` --- # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). 
## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
---
--- --- --- layout: default title: Home | llmware nav_order: 1 description: llmware is an integrated framework with over 50+ models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. permalink: / --- ## Welcome to
llmware
## 🧰🛠️🔩 The Ultimate Toolkit for Enterprise RAG Pipelines with Small, Specialized Models From quickly building POCs to scalable LLM Apps for the enterprise, LLMWare is packed with all the tools you need. `llmware` is an integrated framework with 50+ small, specialized, open source models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. This project provides a comprehensive set of tools that anyone can use - from a beginner to the most sophisticated AI developer - to rapidly build industrial-grade, knowledge-based enterprise LLM applications. Our specific focus is on making it easy to integrate open source small specialized models and to connect enterprise knowledge safely and securely. ## Getting Started 1. Install llmware - `pip3 install llmware` 2. Make sure that you are running on a [supported platform](https://www.github.com/llmware-ai/llmware/tree/main/docs/getting_started/platforms.md#platform-support). 3. Learn by example: -- [Fast Start examples](https://www.github.com/llmware-ai/llmware/tree/main/fast_start) - structured set of 6 examples (with no DB installations required) to learn the main concepts of RAG with LLMWare - each example has extensive comments, and a supporting video on Youtube to walk you through it. -- [Getting Started examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Getting_Started) - heavily-annotated examples that review many getting started elements - selecting a database, loading sample files, working with libraries, and how to use the Model Catalog. -- [Use Case examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases) - longer examples that integrate several components of LLMWare to provide a framework for a solution for common use case patterns. -- Dive into a specific area of interest - [Parsing](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing) - [Models](https://www.github.com/llmware-ai/llmware/tree/main/examples/Models) - [Prompts](https://www.github.com/llmware-ai/llmware/tree/main/examples/Models) - [Agents](https://www.github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents) - and many more ... 4. We provide extensive [sample files](https://www.github.com/llmware-ai/llmware/tree/main/examples/Getting_Started/loading_sample_files.py) integrated into the examples, so you can copy-paste-run, quickly validate that the installation is set up correctly (see the short sketch below), and start seeing key classes and methods in action. We would encourage you to start with the 'out of the box' example first, and then use the example as the launching point for inserting your documents, models, queries, and workflows. 5. Learn by watching: check out the [LLMWare Youtube channel](https://www.youtube.com/@llmware). 6. Share with the community: join us on [Discord](https://discord.gg/MhZn5Nc39h). 
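Before moving on to the larger examples, a tiny end-to-end prompt is a good way to confirm that the install is working - a minimal sketch using the `bling-phi-3-gguf` model referenced throughout these examples (the question and context are illustrative):
```python
from llmware.prompts import Prompt

# load a small, CPU-friendly RAG model from the llmware catalog
prompter = Prompt().load_model("bling-phi-3-gguf")

# ask a question against a short piece of context
context = "The annual subscription fee is $199, billed in January of each year."
response = prompter.prompt_main("What is the annual subscription fee?", context=context,
                                prompt_name="default_with_context", temperature=0.0)

print("llm response: ", response["llm_response"])
```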
[Install llmware](#install-llmware){: .btn .btn-primary .fs-5 .mb-4 .mb-md-0 .mr-2 } [Common Setup & Configuration Items](#platform-support){: .btn .fs-5 .mb-4 .mb-md-0 } [Architecture](architecture.md/#llmware-architecture){: .btn .fs-5 .mb-4 .mb-md-0 } [View llmware on GitHub](https://www.github.com/llmware-ai/llmware/tree/main){: .btn .fs-5 .mb-4 .mb-md-0 } [Open an Issue on GitHub](https://www.github.com/llmware-ai/llmware/issues){: .btn .fs-5 .mb-4 .mb-md-0 } # Install llmware ___ **Using Pip Install** - Installing llmware is easy: `pip3 install llmware` - If you prefer, we also provide a set of recent wheels in the [wheel archives](https://www.github.com/llmware-ai/llmware/tree/main/wheel_archives) in this repository, which can be downloaded individually and used as follows: ```bash pip3 install llmware-0.2.12-py3-none-any.whl ``` - We generally keep the main branch of this repository current with all changes, but we only publish new wheels to PyPI approximately once per week ___ ___ **Cloning the Repository** - If you prefer to clone the repository: ```bash git clone git@github.com:llmware-ai/llmware.git ``` - The llmware package is contained entirely in the /llmware folder path, so you should be able to drop this folder (with all of its contents) into a project tree, and use the llmware module essentially the same as a pip install. - Please ensure that you are capturing and updating the /llmware/lib folder, which includes required compiled shared libraries. If you prefer, you can keep only those libs required for your OS platform. - After cloning the repo, we provide a short 'welcome to llmware' automation script, which can be used to install the project's requirements (from llmware/requirements.txt), install several optional dependencies that are commonly used in examples, copy several good 'getting started' examples into the root folder, and then run a 'welcome_example.py' script to get started using our models. To use the "welcome to llmware" script: Windows: ```bash .\welcome_to_llmware_windows.sh ``` Mac/Linux: ```bash sh ./welcome_to_llmware.sh ``` # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. 
## License `llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
---
--- --- --- layout: default title: Advanced RAG parent: Learn nav_order: 4 description: overview of the major modules and classes of LLMWare permalink: /learn/advanced_techniques_for_rag --- llmware Youtube Video Channel --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Advanced RAG Techniques ** - [Best Small RAG Model - Bling-Phi-3](https://youtu.be/cViMonCAeSc?si=L6jX0sRdZAmKtRcz) - [Agent Automation with Web Services for Financial Research](https://youtu.be/l0jzsg1_Ik0?si=oBGtALHLplouY9x2) - [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG) - [Are you prompting wrong for RAG - Stochastic Sampling-Part I](https://youtu.be/7oMTGhSKuNY?si=_KSjuBnqArvWzYbx) - [Are you prompting wrong for RAG - Stochastic Sampling-Part II- Code Experiments](https://youtu.be/iXp1tj-pPjM?si=3ZeMgipY0vJDHIMY) - [Text2SQL Intro](https://youtu.be/BKZ6kO2XxNo?si=tXGt63pvrp_rOlIP) - [Pop up LLMWare Inference Server](https://www.youtube.com/watch?v=qiEmLnSRDUA&t=20s) - [Hardest Problem in RAG - handling 'Not Found'](https://youtu.be/slDeF7bYuv0?si=j1nkdwdGr5sgvUtK) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- --- --- --- layout: default title: Core RAG Scenarios Running Locally parent: Learn nav_order: 2 description: overview of the major modules and classes of LLMWare permalink: /learn/core_rag_scenarios_running_locally --- Core RAG Scenarios Run Locally --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Core RAG Scenarios** - [Use small LLMs for RAG for Contract Analysis (feat. LLMWare)](https://www.youtube.com/watch?v=8aV5p3tErP0) - [Invoice Processing with LLMware](https://www.youtube.com/watch?v=VHZSaBBG-Bo&t=10s) - [Evaluate LLMs for RAG with LLMWare](https://www.youtube.com/watch?v=s0KWqYg5Buk&t=105s) - [Fast Start to RAG with LLMWare Open Source Library](https://www.youtube.com/watch?v=0naqpH93eEU) - [Use Retrieval Augmented Generation (RAG) without a Database](https://www.youtube.com/watch?v=tAGz6yR14lw) - [Best Small RAG Model - Bling-Phi-3](https://youtu.be/cViMonCAeSc?si=L6jX0sRdZAmKtRcz) - [RAG with BLING on your laptop](https://www.youtube.com/watch?v=JjgqOZ2v5oU) - [DRAGON-7B-Models](https://www.youtube.com/watch?v=d_u7VaKu6Qk&t=37s) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- --- --- --- layout: default title: Voice Transcription with Whisper CPP parent: Learn nav_order: 6 description: overview of the major modules and classes of LLMWare permalink: /learn/integrated_voice_transcription_with_whisper_cpp --- Integrated Voice Transcription with Whisper CPP --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Using Whisper CPP Models** - [Getting Started with Whisper.CPP](https://youtu.be/YG5u5AOU9MQ?si=5xQYZCILPSiR8n4s) - [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- --- --- --- layout: default title: Learn nav_order: 4 has_children: true description: key learning resources permalink: /learn --- Learn: Youtube Video Series --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Some of our most recent videos** - [Best Small RAG Model - Bling-Phi-3](https://youtu.be/cViMonCAeSc?si=L6jX0sRdZAmKtRcz) - [Agent Automation with Web Services for Financial Research](https://youtu.be/l0jzsg1_Ik0?si=oBGtALHLplouY9x2) - [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG) - [Are you prompting wrong for RAG - Stochastic Sampling-Part I](https://youtu.be/7oMTGhSKuNY?si=_KSjuBnqArvWzYbx) - [Are you prompting wrong for RAG - Stochastic Sampling-Part II- Code Experiments](https://youtu.be/iXp1tj-pPjM?si=3ZeMgipY0vJDHIMY) 🎬 **Using Agents, Function Calls and SLIM models** - [SLIMS Playlist](https://youtube.com/playlist?list=PL1-dn33KwsmAHWCWK6YjZrzicQ2yR6W8T&si=TSFGqQ3ObOO5vDde) - [Agent-based Complex Research Analysis](https://youtu.be/y4WvwHqRR60?si=jX3KCrKcYkM95boe) - [Getting Started with SLIMs (with code)](https://youtu.be/aWZFrTDmMPc?si=lmo98_quo_2Hrq0C) - [SLIM Models Intro](https://www.youtube.com/watch?v=cQfdaTcmBpY) - [Text2SQL Intro](https://youtu.be/BKZ6kO2XxNo?si=tXGt63pvrp_rOlIP) - [Pop up LLMWare Inference Server](https://www.youtube.com/watch?v=qiEmLnSRDUA&t=20s) - [Hardest Problem in RAG - handling 'Not Found'](https://youtu.be/slDeF7bYuv0?si=j1nkdwdGr5sgvUtK) - [Extract Information from Earnings Releases](https://youtu.be/d6HFfyDk4YE?si=VmnIiWFmgBtR4DxS) - [Summary Function Calls](https://youtu.be/yNg_KH5cPSk?si=Yl94tp_vKA8e7eT7) - [Boolean Yes-No Function Calls](https://youtu.be/jZQZMMqAJXs?si=lU4YVI0H0tfc9k6e) - [Autogenerate Topics, Tags and NER](https://youtu.be/N6oOxuyDsC4?si=vo2Fd8VG5xTbH4SD) 🎬 **Using GGUF Models** - [Using LM Studio Models](https://www.youtube.com/watch?v=h2FDjUyvsKE) - [Using Ollama Models](https://www.youtube.com/watch?v=qITahpVDuV0) - [Use any GGUF Model](https://www.youtube.com/watch?v=9wXJgld7Yow) - [Background on GGUF Quantization & DRAGON Model Example](https://www.youtube.com/watch?v=ZJyQIZNJ45E) - [Getting Started with Whisper.CPP](https://youtu.be/YG5u5AOU9MQ?si=5xQYZCILPSiR8n4s) 🎬 **Core RAG Scenarios Running Locally** - [RAG with BLING on your laptop](https://www.youtube.com/watch?v=JjgqOZ2v5oU) - [DRAGON-7B-Models](https://www.youtube.com/watch?v=d_u7VaKu6Qk&t=37s) - [Use small LLMs for RAG for Contract Analysis (feat. 
LLMWare)](https://www.youtube.com/watch?v=8aV5p3tErP0) - [Invoice Processing with LLMware](https://www.youtube.com/watch?v=VHZSaBBG-Bo&t=10s) - [Evaluate LLMs for RAG with LLMWare](https://www.youtube.com/watch?v=s0KWqYg5Buk&t=105s) - [Fast Start to RAG with LLMWare Open Source Library](https://www.youtube.com/watch?v=0naqpH93eEU) - [Use Retrieval Augmented Generation (RAG) without a Database](https://www.youtube.com/watch?v=tAGz6yR14lw) 🎬 **Parsing, Embedding, Data Pipelines and Extraction** - [Ingest PDFs at Scale](https://www.youtube.com/watch?v=O0adUfrrxi8&t=10s) - [Install and Compare Multiple Embeddings with Postgres and PGVector](https://www.youtube.com/watch?v=Bncvggy6m5Q) - [Intro to Parsing and Text Chunking](https://youtu.be/2xDefZ4oBOM?si=YZzBUjDfQ0839EVF) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- --- --- --- layout: default title: Other Topics parent: Learn nav_order: 7 description: overview of the major modules and classes of LLMWare permalink: /learn/other_topics --- Other Notable Videos and Topics --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Some of our most recent videos** - [Fast Local Chatbot with Phi-3-GGUF](https://youtu.be/gzzEVK8p3VM?si=HTMWQtN9XuaqjmpK) - [Document Summarization](https://youtu.be/Ps3W-P9A1m8?si=mHvCcHvrKzndaNul) - [Agent Server](https://youtu.be/nsA6-ZdnkXg?si=v7iGhC_rpj8TWbbl) - [Best Small RAG Model - Bling-Phi-3](https://youtu.be/cViMonCAeSc?si=L6jX0sRdZAmKtRcz) - [Agent Automation with Web Services for Financial Research](https://youtu.be/l0jzsg1_Ik0?si=oBGtALHLplouY9x2) - [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG) - [Are you prompting wrong for RAG - Stochastic Sampling-Part I](https://youtu.be/7oMTGhSKuNY?si=_KSjuBnqArvWzYbx) - [Are you prompting wrong for RAG - Stochastic Sampling-Part II- Code Experiments](https://youtu.be/iXp1tj-pPjM?si=3ZeMgipY0vJDHIMY) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- --- --- --- layout: default title: Parsing Embedding and Data Extraction parent: Learn nav_order: 5 description: overview of the major modules and classes of LLMWare permalink: /learn/parsing_embedding_data_extraction --- Parsing, Embedding, and Data Extraction --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Parsing, Embedding, Data Pipelines and Extraction** - [Advanced Parsing Techniques](https://youtu.be/dEsw8V_YBYY?si=B0GTVNhwfBYWkXyf) - [Ingest PDFs at Scale](https://www.youtube.com/watch?v=O0adUfrrxi8&t=10s) - [Install and Compare Multiple Embeddings with Postgres and PGVector](https://www.youtube.com/watch?v=Bncvggy6m5Q) - [Intro to Parsing and Text Chunking](https://youtu.be/2xDefZ4oBOM?si=YZzBUjDfQ0839EVF) - [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- --- --- --- layout: default title: Using Agents & Function Calls with SLIM Models parent: Learn nav_order: 1 description: overview of the major modules and classes of LLMWare permalink: /learn/using_agents_functions_slim_models --- Using Agents, Function Calls and SLIM Models --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Using Agents, Function Calls and SLIM models** - [Agent Automation with Web Services for Financial Research](https://youtu.be/l0jzsg1_Ik0?si=oBGtALHLplouY9x2) - [Sentiment Analysis](https://youtu.be/ERCHP21oAN8?si=fp6D4Tk9J2HdDRXa) - [SLIMS Playlist](https://youtube.com/playlist?list=PL1-dn33KwsmAHWCWK6YjZrzicQ2yR6W8T&si=TSFGqQ3ObOO5vDde) - [Agent-based Complex Research Analysis](https://youtu.be/y4WvwHqRR60?si=jX3KCrKcYkM95boe) - [Getting Started with SLIMs (with code)](https://youtu.be/aWZFrTDmMPc?si=lmo98_quo_2Hrq0C) - [SLIM Models Intro](https://www.youtube.com/watch?v=cQfdaTcmBpY) - [Text2SQL Intro](https://youtu.be/BKZ6kO2XxNo?si=tXGt63pvrp_rOlIP) - [Pop up LLMWare Inference Server](https://www.youtube.com/watch?v=qiEmLnSRDUA&t=20s) - [Hardest Problem in RAG - handling 'Not Found'](https://youtu.be/slDeF7bYuv0?si=j1nkdwdGr5sgvUtK) - [Extract Information from Earnings Releases](https://youtu.be/d6HFfyDk4YE?si=VmnIiWFmgBtR4DxS) - [Summary Function Calls](https://youtu.be/yNg_KH5cPSk?si=Yl94tp_vKA8e7eT7) - [Boolean Yes-No Function Calls](https://youtu.be/jZQZMMqAJXs?si=lU4YVI0H0tfc9k6e) - [Autogenerate Topics, Tags and NER](https://youtu.be/N6oOxuyDsC4?si=vo2Fd8VG5xTbH4SD) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- --- --- --- layout: default title: Using Quantized GGUF Models parent: Learn nav_order: 3 description: overview of the major modules and classes of LLMWare permalink: /learn/using_quantized_gguf_models --- Using Quantized GGUF Models --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Using GGUF Models** - [Using LM Studio Models](https://www.youtube.com/watch?v=h2FDjUyvsKE) - [Using Ollama Models](https://www.youtube.com/watch?v=qITahpVDuV0) - [Use any GGUF Model](https://www.youtube.com/watch?v=9wXJgld7Yow) - [Background on GGUF Quantization & DRAGON Model Example](https://www.youtube.com/watch?v=ZJyQIZNJ45E) - [Getting Started with Whisper.CPP](https://youtu.be/YG5u5AOU9MQ?si=5xQYZCILPSiR8n4s) - [Best Small RAG Model - Bling-Phi-3](https://youtu.be/cViMonCAeSc?si=L6jX0sRdZAmKtRcz) - [Agent Automation with Web Services for Financial Research](https://youtu.be/l0jzsg1_Ik0?si=oBGtALHLplouY9x2) - [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG) - [Are you prompting wrong for RAG - Stochastic Sampling-Part I](https://youtu.be/7oMTGhSKuNY?si=_KSjuBnqArvWzYbx) - [Are you prompting wrong for RAG - Stochastic Sampling-Part II- Code Experiments](https://youtu.be/iXp1tj-pPjM?si=3ZeMgipY0vJDHIMY) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- ---