# Llmware

> description: community resources, getting help and sharing ideas

---
layout: default
title: Community
nav_order: 6
has_children: true
description: community resources, getting help and sharing ideas
permalink: /community
---

# Community

Welcome to the llmware community!

We are on a mission to pioneer the use of small language models as transformational tools in the enterprise to automate workflows and knowledge-based processes cost-effectively, securely and with high quality.

We believe the secret is getting out that small models can be extremely effective, but they require a lot of attention to detail in building scalable data pipelines and fine-tuning both models and end-to-end workflows.

We are open to everyone, from the most advanced machine learning researchers to beginning developers just learning Python. We publish a wide range of examples, use cases and tutorial videos, and are always looking for feedback, new ideas and contributors.

{: .note}
> Contributions to `llmware` are governed by our [Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md).

{: .warning}
> Have you found a security issue? Then please jump to [Security Vulnerabilities](#security-vulnerabilities).

On this page, we provide information about contributing to ``llmware``. There are **two ways** you can contribute. The first is by making **code contributions**, and the second is by making contributions to the **documentation**. Please look at our [contribution suggestions](#how-can-you-contribute) if you need inspiration, or take a look at [open issues](#open-issues).

Contributions to `llmware` are welcome from everyone. Our goal is to make the process simple, transparent, and straightforward. We are happy to receive suggestions on how the process can be improved.

## How can you contribute?

{: .note}
> If you have never contributed before, look for issues with the tag [``good first issue``](https://github.com/llmware-ai/llmware/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22).

The most common ways to contribute are to add new features, fix bugs, add tests, or add documentation. You can visit the [issues](https://github.com/llmware-ai/llmware/issues) page of the project and search for tags such as ``bug``, ``enhancement``, ``documentation``, or ``test``.

Here is a non-exhaustive list of contributions you can make.

1. Code refactoring
2. Add new text databases
3. Add new vector databases
4. Fix bugs
5. Add usage examples (see for example the issues [jupyter notebook - more examples and better support](https://github.com/llmware-ai/llmware/issues/508) and [google colab examples and start up scripts](https://github.com/llmware-ai/llmware/issues/507))
6. Add experimental features
7. Improve code quality
8. Improve documentation in the docs (what you are reading right now)
9. Improve documentation by adding or updating docstrings in modules, classes, methods, or functions (see for example [Add docstrings](https://github.com/llmware-ai/llmware/issues/219))
10. Improve test coverage
11. Answer questions in our [Discord channel](https://discord.gg/MhZn5Nc39h), especially in the [technical support forum](https://discord.com/channels/1179245642770559067/1218498778915672194)
12. Post projects in which you use ``llmware`` in our Discord forum [made with llmware](https://discord.com/channels/1179245642770559067/1218567269471486012), ideally with a link to a public GitHub repository

## Open Issues

If you're interested in existing issues, you can

- Look for issues - if you are new to the project, start with those labeled `good first issue`.
- Provide answers for questions in our [GitHub discussions](https://github.com/llmware-ai/llmware/discussions)
- Provide help for bug or enhancement issues. Ask questions, reproduce the issues, or provide solutions.
- Open a pull request to fix an issue.

## Security Vulnerabilities

**If you believe you've found a security vulnerability, then please _do not_ submit an issue ticket or pull request or otherwise publicly disclose the issue.**

Please follow the process at [Reporting a Vulnerability](https://github.com/llmware-ai/llmware/blob/main/Security.md)

## GitHub workflow

We follow the [``fork-and-pull``](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork) Git workflow.

1. [Fork](https://docs.github.com/en/github/getting-started-with-github/fork-a-repo) the repository on GitHub.
2. Clone your fork to your local machine with `git clone git@github.com:<your-username>/llmware.git`.
3. Create a branch with `git checkout -b my-topic-branch`.
4. Run the test suite by navigating to the tests/ folder and running ```./run-tests.py -s``` to ensure there are no failures.
5. [Commit](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/committing-changes-to-a-pull-request-branch-created-from-a-fork) changes to your own branch, then push to GitHub with `git push origin my-topic-branch`.
6. Submit a [pull request](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests) so that we can review your changes.

Remember to [synchronize your forked repository](https://docs.github.com/en/github/getting-started-with-github/fork-a-repo#keep-your-fork-synced) _before_ submitting proposed changes upstream. If you have an existing local repository, please update it before you start, to minimize the chance of merge conflicts.

```shell
git remote add upstream git@github.com:llmware-ai/llmware.git
git fetch upstream
git checkout upstream/main -b my-topic-branch
```

## Community

Questions and discussions are welcome in any shape or form. Please feel free to join our community on our Discord channel, on which we are active daily. You are also welcome if you just want to post an idea!

- [Discord Channel](https://discord.gg/MhZn5Nc39h)
- [GitHub discussions](https://github.com/llmware-ai/llmware/discussions)

---

---
layout: default
title: FAQ
parent: Community
nav_order: 1
description: overview of the major modules and classes of LLMWare
permalink: /community/faq
---

# Frequently Asked Questions (FAQ)

### How can I set the chunk size?

#### "I want to parse my documents into smaller chunks"

You can set the chunk size with the ``chunk_size`` parameter of the ``add_files`` method.

The ``add_files`` method from the ``Library`` class has a ``chunk_size`` parameter that controls the chunk size. The method also has a ``max_chunk_size`` parameter that controls the maximum chunk size. These two parameters are passed on to the ``Parser`` class.

In the following example, we add the same files with different chunk sizes to the library ``chunk_size_example``.
```python
from pathlib import Path

from llmware.library import Library

path_to_my_library_files = Path('~/llmware_data/sample_files/Agreements').expanduser()

my_library = Library().create_new_library(library_name='chunk_size_example')

my_library.add_files(input_folder_path=path_to_my_library_files, chunk_size=400)
my_library.add_files(input_folder_path=path_to_my_library_files, chunk_size=600)
```

### How can I set the embedding store?

#### "I want to use a specific embedding store"

You can set the embedding store with the ``vector_db`` parameter of the ``install_new_embedding`` method, which you call on a ``Library`` object each time you want to create an embedding for a *library*.

The ``install_new_embedding`` method from the ``Library`` class has a ``vector_db`` parameter that sets the embedding store. At the time of this writing, *LLMWare* supports the embedding stores [chromadb](https://github.com/chroma-core/chroma), [neo4j](https://github.com/neo4j/neo4j), [milvus](https://github.com/milvus-io/milvus), [pg_vector](https://github.com/pgvector/pgvector), [postgres](https://github.com/postgres/postgres), [redis](https://github.com/redis/redis), [pinecone](https://www.pinecone.io/), [faiss](https://github.com/facebookresearch/faiss), [qdrant](https://github.com/qdrant/qdrant), [mongo atlas](https://www.mongodb.com/products/platform/atlas-database), and [lancedb](https://github.com/lancedb/lancedb).

In the following example, we create the same embeddings three times for the same library, but store them in three different embedding stores.

```python
import logging
from pathlib import Path

from llmware.configs import LLMWareConfig
from llmware.library import Library

logging.info(f'Currently supported embedding stores: {LLMWareConfig().get_supported_vector_db()}')

library = Library().create_new_library(library_name='embedding_store_example')
library.add_files(input_folder_path=Path('~/llmware_data/sample_files/Agreements').expanduser())

library.install_new_embedding(vector_db="pg_vector")
library.install_new_embedding(vector_db="milvus")
library.install_new_embedding(vector_db="faiss")
```

### How can I set the collection store?

#### "I want to use a specific collection store"

You can set the collection store with the ``set_active_db`` method of the ``LLMWareConfig`` class.

The collection store is set using the ``LLMWareConfig`` class with the ``set_active_db`` method. At the time of writing, **LLMWare** supports the three collection stores *MongoDB*, *Postgres*, and *SQLite* - which is the default. You can retrieve the supported collection stores with the ``get_supported_collection_db`` method.

In the example below, we first log the currently active collection store, then we retrieve the supported collection stores, before we switch to *Postgres*.

```python
import logging

from llmware.configs import LLMWareConfig

logging.info(f'Currently active collection store: {LLMWareConfig.get_active_db()}')
logging.info(f'Currently supported collection stores: {LLMWareConfig().get_supported_collection_db()}')

LLMWareConfig.set_active_db("postgres")

logging.info(f'Currently active collection store: {LLMWareConfig.get_active_db()}')
```

### How can I retrieve more context?

#### "I want to retrieve more context from a query"

One way to retrieve more context is to set the ``result_count`` parameter of the ``query``, ``text_query``, and ``semantic_query`` methods from the ``Query`` class. By increasing ``result_count``, the number of retrieved results is increased, which increases the context size.
The ``Query`` class has the ``query``, ``text_query``, and ``semantic_query`` methods, which allow you to set the number of retrieved results with ``result_count``. On a side note, ``query`` is a wrapper function for ``text_query`` and ``semantic_query``. The value of ``result_count`` is passed on to the queried embedding store to control the number of retrieved results. For example, for *pgvector*, ``result_count`` is passed on as the value after the ``LIMIT`` keyword. In the ``SQL`` example below, you can see the resulting ``SQL`` query generated by ``LLMWare`` if ``result_count=10``, the name of the collection is ``agreements``, and the query vector is ``[1, 2, 3]``.

```sql
SELECT id, block_mongo_id, embedding <-> '[1, 2, 3]' AS distance, text
FROM agreements
ORDER BY distance
LIMIT 10;
```

In the following example, we execute the same query against a library twice but change the number of retrieved results from ``3`` to ``6``.

```python
import logging
from pathlib import Path

from llmware.configs import LLMWareConfig
from llmware.library import Library
from llmware.retrieval import Query

logging.info(f'Currently supported embedding stores: {LLMWareConfig().get_supported_vector_db()}')

library = Library().create_new_library(library_name='context_size_example')
library.add_files(input_folder_path=Path('~/llmware_data/sample_files/Agreements').expanduser())
library.install_new_embedding(vector_db="pg_vector")

query = Query(library)

query_results = query.semantic_query(query='salary', result_count=3, results_only=True)
logging.info(f'Number of results: {len(query_results)}')

query_results = query.semantic_query(query='salary', result_count=6, results_only=True)
logging.info(f'Number of results: {len(query_results)}')
```

### How can I set the Large Language Model?

#### "I want to use a different LLM"

You can set the Large Language Model (LLM) with the ``gen_model`` parameter of the ``load_model`` method from the ``Prompt`` class.

The ``Prompt`` class has the method ``load_model`` with the ``gen_model`` parameter, which sets the LLM. The ``gen_model`` parameter is passed on to the ``ModelCatalog`` class, which loads the LLM either from HuggingFace or from another source. The ``ModelCatalog`` allows you to **list all available models** with the method ``list_generative_models``, or just the local models with ``list_generative_local_models``, or just the open source models with ``list_open_source_models``.

In the example below, we log all available LLMs, including the ones that are available locally and the open source ones, and also create the prompters. Each prompter uses a different LLM from our [BLING model series](https://llmware.ai/about), which you can also find on [HuggingFace](https://huggingface.co/collections/llmware/bling-models-6553c718f51185088be4c91a).

```python
import logging

from llmware.models import ModelCatalog
from llmware.prompts import Prompt

llm_gen = ModelCatalog().list_generative_models()
logging.info(f'List of all LLMs: {llm_gen}')

llm_gen_local = ModelCatalog().list_generative_local_models()
logging.info(f'List of all local LLMs: {llm_gen_local}')

llm_gen_open_source = ModelCatalog().list_open_source_models()
logging.info(f'List of all open source LLMs: {llm_gen_open_source}')

prompter_bling_1b = Prompt().load_model(gen_model='llmware/bling-1b-0.1')
prompter_bling_tiny_llama = Prompt().load_model(gen_model='llmware/bling-tiny-llama-v0')
prompter_bling_falcon_1b = Prompt().load_model(gen_model='llmware/bling-falcon-1b-0.1')
```
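Once a prompter has been created, you can run an inference with it. Below is a minimal usage sketch, assuming ``prompt_main`` as the basic inference entry point with an optional ``context`` string; the question and context text are illustrative only.

```python
from llmware.prompts import Prompt

# load one of the BLING models into a prompter (same pattern as the example above)
prompter = Prompt().load_model(gen_model='llmware/bling-tiny-llama-v0')

# assumption for illustration: prompt_main runs a basic inference and accepts
# an optional context passage that grounds the answer
context = "The base salary of the executive shall be $350,000 per year."
response = prompter.prompt_main("What is the annual base salary?", context=context)

print("response: ", response)
```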
#### "I want to use a different embedding model" You can set the embedding model with the ``embedding_model_name`` parameter of the ``install_new_embedding`` method from the ``Library`` class. The ``Library`` class has the method ``install_new_embedding`` with the ``embedding_model_name`` parameter which sets the embedding model. The ``ModelCatalog`` allows you to **list all available embedding models** with the ``list_embedding_models`` method. In the following example, we list all available embedding models, and then we create a library with the name ``embedding_models_example``, which we embed two times with embedding models ``'mini-lm-sber'`` and ``'industry-bert-contracts'``. ```python import logging from llmware.models import ModelCatalog from llmware.library import Library embedding_models = ModelCatalog().list_generative_models() logging.info(f'List of embedding models: {embedding_models}') library = Library().create_new_library(library_name='embedding_models_example') library.add_files(input_foler_path=Path('~/llmware_data/sample_files/Agreements')) library.install_new_embedding(embedding_model_name='mini-lm-sber') library.install_new_embedding(embedding_model_name='industry-bert-contracts') ``` ### Why is the model running slowly in Google Colab? #### "I want to improve the performance of my model on Google Colab" Our models are designed to run on at least 16GB of RAM. By default Google Colab provides ~13GB of RAM, which significantly slows computational speed. To ensure the best performance when using our models, we highly recommend enabling the T4 GPU in Colab. This will provide the notebook with additional resources, including 16GB of RAM, allowing our models to run smoothly and efficiently. Steps to enabling T4 GPU in Colab: 1. In your Colab notebook, click on the "Runtime" tab 2. Select "Change runtime type" 3. Under "Hardware Accelerator", select T4 GPU NOTE: There is a weekly usage limit on using T4 for free. --- --- layout: default title: Join Our Community parent: Community nav_order: 4 description: overview of the major modules and classes of LLMWare permalink: /community/join_our_community --- # Join the LLMWare Community ___ # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. 
## License

`llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!

---

---
layout: default
title: Need Help
parent: Community
nav_order: 3
description: overview of the major modules and classes of LLMWare
permalink: /community/need_help
---

# Need Help

___

# More information about the project

- [see main repository](https://www.github.com/llmware-ai/llmware.git)

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).

## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!

---

---
layout: default
title: Troubleshooting
parent: Community
nav_order: 2
description: overview of the major modules and classes of LLMWare
permalink: /community/troubleshooting
---

# Common Troubleshooting Issues

___

1. **Cannot install the pip package**

    -- Check your Python version. If using Python 3.9-3.11, then almost any version of llmware should work. If using an older Python (before 3.9), then it is likely that dependencies will fail in the pip process. If you are using Python 3.12, then you need to use llmware>=0.2.12.

    -- Dependency constraint error. If you receive a specific error around a dependency version constraint, then please raise an issue and include details about your OS, Python version, any unique elements in your virtual environment, and the specific error.

2. **Parser module not found**

    -- Check your OS and confirm that you are using a [supported platform](platforms.md/#platform-support).

    -- If you cloned the repository, please confirm that the /lib folder has been copied into your local path.

3. **Pytorch model not loading**

    -- Confirm the obvious stuff - correct model name, model exists in the Huggingface repository, connected to the Internet with open ports for HTTPS connection, etc.

    -- Check your Pytorch version - update Pytorch to >2.0, which is required for many recent models released in the last 6 months, and which, in some cases, may require other dependencies not included in the llmware package.

    -- Note: we have seen some compatibility issues with Pytorch==2.3 on Wintel platforms - if you run into these issues, we recommend using a back-level Pytorch==2.1, which we have seen fix the issue.
4. **GGUF model not loading**

    -- Confirm that you are using llmware>=0.2.11 for the latest GGUF support.

    -- Confirm that you are using a [supported platform](platforms.md/#platform-support). We provide pre-built binaries for llama.cpp as a back-end GGUF engine on the following platforms:

    - Mac M1/M2/M3 - OS version 14 - "with accelerate framework"
    - Mac M1/M2/M3 - OS older versions - "without accelerate framework"
    - Windows - x86
    - Windows with CUDA
    - Linux - x86 (Ubuntu 20+)
    - Linux with CUDA (Ubuntu 20+)

    If you are using a different OS platform, you have the option to "bring your own llama.cpp" lib as follows:

    ```python
    from llmware.gguf_configs import GGUFConfigs
    GGUFConfigs().set_config("custom_lib_path", "/path/to/your/libllama_binary")
    ```

    If you have any trouble, feel free to raise an Issue and we can provide you with instructions and/or help compiling llama.cpp for your platform.

    -- Specific GGUF model - if you are successfully using other GGUF models, and only having problems with a specific model, then please raise an Issue and share the specific model and architecture.

5. **Example not working as expected** - please raise an issue, so we can evaluate and fix any bugs in the example code. Also, pull requests are always especially welcome with a fix or improvement in an example.

6. **Model not leveraging CUDA available in the environment**

    -- **Check that the CUDA drivers are installed correctly** - an easy check of the NVIDIA CUDA drivers is to use `nvidia-smi` and `nvcc --version` from the command line. Both commands should respond positively with details on the versions and implementations. Any error indicates that either the driver or the CUDA toolkit is not installed or not recognized. It can be complicated at times to debug the environment, usually with some trial and error. See the extensive [Nvidia Developer documentation](https://docs.nvidia.com) for troubleshooting steps specific to your environment.

    -- **Check that the CUDA drivers are up to date** - we build to CUDA 12.1, which translates to a minimum driver version of 525.60 on Linux, and 528.33 on Windows.

    -- **Pytorch model** - check that Pytorch is finding CUDA, e.g., `torch.cuda.is_available()` == True. We have seen issues on Windows, in particular, so confirm that your Pytorch version has been compiled with CUDA support. For Windows, in particular, we have found that you may need to install a CUDA-specific build of Pytorch, using the following command: ```pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121```

    -- **GGUF model** - logs will be displayed on the screen confirming whether CUDA is being used, or whether there is a 'fall-back' to CPU drivers. We run a custom CUDA install check, which you can run on your system with: ```gpu_status = ModelCatalog().gpu_available```

    If you have confirmed that CUDA is present, but fall-back to CPU is being used, you can set the GGUFConfigs to force CUDA: ```GGUFConfigs().set_config("force_gpu", True)```

    If you are looking to use specific optimizations, you can bring your own llama.cpp lib as follows: ```GGUFConfigs().set_config("custom_lib_path", "/path/to/your/custom/llama_cpp_backend")```

    -- If you cannot debug after these steps, then please raise an Issue. We are happy to dig in and work with you to run FAST local inference.

7. **Model result inconsistent**

    -- When loading the model, set `temperature=0.0` and `sample=False` -> this will give a deterministic output for better testing and debugging.
    -- Usually the issue will be related to the retrieval step and the formation of the Prompt - and, as always, good pipelines and a little experimentation usually help!

8. **Newly added examples not working as intended**

    -- If you run a recently added example and it does not run as intended, it is possible that the feature being used in the example has not yet been added to the latest pip install.

    -- To fix this, move the example file to the outermost directory of the repository, so that the example file you are trying to run is in the same directory as the `llmware` source code directory.

    -- This will let you run the example using the latest source code!

9. **Git permission denied error**

    -- If you are using SSH to clone the repository and you get an error that looks similar to `git@github.com: Permission denied (publickey)`, then you might not have configured your SSH key correctly.

    -- If you don't already have one, you will need to create a new SSH key on your local machine. For instructions on how to do this, check out this page: https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent.

    -- You then need to add the SSH key to your GitHub account. For instructions on how to do this, check out this page: https://docs.github.com/en/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account.

# More information about the project

- [see main repository](https://www.github.com/llmware-ai/llmware.git)

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).

## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!

---

---
layout: default
title: Agent Inference Server
parent: Components
nav_order: 12
description: overview of the major modules and classes of LLMWare
permalink: /components/agent_inference_server
---

# Agent Inference Server

---

LLMWare supports multiple deployment options, including the use of REST APIs to implement most model invocations. To set up an inference server for Agent processes:

```python
""" This example shows how to set up an inference server that can be used in conjunction with agent-based workflows.

    This script covers both the server-side deployment, as well as the steps taken on the client-side to deploy
    in an Agent example.

    Note: this example will build off two other examples:
    1. "examples/Models/launch_llmware_inference_server.py"
    2. "examples/SLIM-Agents/agent-llmfx-getting-started.py"
"""

from llmware.models import ModelCatalog, LLMWareInferenceServer

#   *** SERVER SIDE SCRIPT ***

base_model = "llmware/bling-tiny-llama-v0"

LLMWareInferenceServer(base_model,
                       model_catalog=ModelCatalog(),
                       secret_api_key="demo-test",
                       home_path="/home/ubuntu/",
                       verbose=True).start()

# this will start Flask-based server, which will display the launched IP address and port, e.g.,
# "Running on "

ip_address = "http://127.0.0.1:8080"

#   *** CLIENT SIDE AGENT PROCESS ***

from llmware.agents import LLMfx


def create_multistep_report_over_api_endpoint():

    """ This is derived from the script in the example agent-llmfx-getting-started.py. """

    customer_transcript = "My name is Michael Jones, and I am a long-time customer. " \
                          "The Mixco product is not working currently, and it is having a negative impact " \
                          "on my business, as we can not deliver our products while it is down. " \
                          "This is the fourth time that I have called. My account number is 93203, and " \
                          "my user name is mjones. Our company is based in Tampa, Florida."

    # create an agent using LLMfx class
    agent = LLMfx()

    # copy the ip address from the Flask launch readout
    ip_address = "http://127.0.0.1:8080"

    # inserting this line below into the agent process sets the 'api endpoint' execution to "ON"
    # all agent function calls will be deployed over the API endpoint on the remote inference server
    # to "switch back" to local execution, comment out this line
    agent.register_api_endpoint(api_endpoint=ip_address,
                                api_key="demo-test",
                                endpoint_on=True)

    # to explicitly turn the api endpoint "on" or "off"
    # agent.switch_endpoint_on()
    # agent.switch_endpoint_off()

    agent.load_work(customer_transcript)

    # load tools individually
    agent.load_tool("sentiment")
    agent.load_tool("ner")

    # load multiple tools
    agent.load_tool_list(["emotions", "topics", "intent", "tags", "ratings", "answer"])

    # start deploying tools and running various analytics
    # first conduct three 'soft skills' initial assessment using 3 different models
    agent.sentiment()
    agent.emotions()
    agent.intent()

    # alternative way to execute a tool, passing the tool name as a string
    agent.exec_function_call("ratings")

    # call multiple tools concurrently
    agent.exec_multitool_function_call(["ner", "topics", "tags"])

    # the 'answer' tool is a quantized question-answering model - ask an 'inline' question
    # the optional 'key' assigns the output to a dictionary key for easy consolidation
    agent.answer("What is a short summary?", key="summary")

    # prompting tool to ask a quick question as part of the analytics
    response = agent.answer("What is the customer's account number and user name?", key="customer_info")

    # you can 'unload_tool' to release it from memory
    agent.unload_tool("ner")
    agent.unload_tool("topics")

    # at end of processing, show the report that was automatically aggregated by key
    report = agent.show_report()

    # displays a summary of the activity in the process
    activity_summary = agent.activity_summary()

    # list of the responses gathered
    for i, entries in enumerate(agent.response_list):
        print("update: response analysis: ", i, entries)

    output = {"report": report, "activity_summary": activity_summary, "journal": agent.journal}

    return output
```

Need help or have questions?
============================

Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).

Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).

## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!

---

---
layout: default
title: Agents
parent: Components
nav_order: 4
description: overview of the major modules and classes of LLMWare
permalink: /components/agents
---

# Agents

---

Agents with Function Calls and SLIM Models 🔥

llmware has been designed to enable Agent and LLM-based function calls using small language models designed for local and private deployment, with the ability to leverage open source models to conduct complex RAG and knowledge-based workflow automation.

The key elements in llmware:

- **SLIM models** - 18 function-calling small language models, each optimized for a specific extraction, classification, generation, or summarization activity, which generate python dictionaries and lists as output.
- **LLMfx class** - enables a wide range of agent-based processes.

Here is an example to get started:

```python
from llmware.agents import LLMfx

text = ("Tesla stock fell 8% in premarket trading after reporting fourth-quarter revenue and profit that "
        "missed analysts’ estimates. The electric vehicle company also warned that vehicle volume growth in "
        "2024 'may be notably lower' than last year’s growth rate. Automotive revenue, meanwhile, increased "
        "just 1% from a year earlier, partly because the EVs were selling for less than they had in the past. "
        "Tesla implemented steep price cuts in the second half of the year around the world. In a Wednesday "
        "presentation, the company warned investors that it’s 'currently between two major growth waves.'")

# create an agent using LLMfx class
agent = LLMfx()

# load text to process
agent.load_work(text)

# load 'models' as 'tools' to be used in analysis process
agent.load_tool("sentiment")
agent.load_tool("extract")
agent.load_tool("topics")
agent.load_tool("boolean")

# run function calls using different tools
agent.sentiment()
agent.topics()
agent.extract(params=["company"])
agent.extract(params=["automotive revenue growth"])
agent.xsum()
agent.boolean(params=["is 2024 growth expected to be strong? (explain)"])

# at end of processing, show the report that was automatically aggregated by key
report = agent.show_report()

# displays a summary of the activity in the process
activity_summary = agent.activity_summary()

# list of the responses gathered
for i, entries in enumerate(agent.response_list):
    print("update: response analysis: ", i, entries)

output = {"report": report, "activity_summary": activity_summary, "journal": agent.journal}
```

Need help or have questions?
============================

Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).

Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).

## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!

---

---
layout: default
title: Components
nav_order: 3
has_children: true
description: llmware key architectural components, modules and classes
permalink: /components
---

# LLMWare Architecture

---

llmware is characterized by a logically integrated set of data pipelines involved in building LLM-based workflows, centered on two main sub-pipelines with high-level interfaces intended to provide an abstraction layer over individual 'end point' components, to promote code re-use and the ability to easily 'swap' different components with minimal, if any, code change:
**1. Knowledge Ingestion** - "creating Gen AI food" - ingesting and organizing unstructured information from a wide range of data sources, including each of the major steps:

- Extracting and Parsing
- Text Chunking
- Indexing, Organizing and Storing
- Embedding
- Retrieval
- Analytics and Reuse of Content
- Combining with SQL Table and Other Structured Content

**Core LLMWare classes**: **Library**, **Query** (retrieval module), **Parser**, **EmbeddingHandler** (embeddings module), **Graph**, **CustomTables** (resources module) and **Datasets** (dataset_tools module).

In many cases, it is easy to get things done in LLMWare using only **Library** and **Query** - which provide convenient interfaces into parsing and embedding such that most use cases will not require calling those classes directly.

Supported document file types: pdf, pptx, docx, xlsx, txt, csv, html, jsonl, json, tsv, jpg, jpeg, png, wav, zip, md, mp3, mp4, m4a

Key methods to know (a minimal end-to-end sketch combining these calls follows the examples list below):

- Ingest anything - `Library().add_files(input_folder_path="path/to/docs")`
- Embed library - `Library().install_new_embedding(embedding_model_name="your embedding model", vector_db="your vector db")`
- Run Query - `Query(library).query(query, query_type="semantic", result_count=20)`

Top examples to get started:

- [Parsing examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing) - ~14 stand-alone parsing examples for all common document types, including options for parsing in memory, outputting to JSON, parsing custom configured CSV and JSON files, running OCR on embedded images found in documents, table extraction, image extraction, text chunking, zip files, and web sources.
- [Embedding examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Embedding) - ~15 stand-alone embedding examples to show how to use ~10 different vector databases and a wide range of leading open source embedding models (including sentence transformers).
- [Retrieval examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Retrieval) - ~10 stand-alone examples illustrating different query and retrieval techniques - semantic queries, text queries, document filters, page filters, 'hybrid' queries, author search, using query state, and generating bibliographies.
- [Dataset examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Datasets) - ~5 stand-alone examples to show 'next steps' of how to leverage a Library to re-package content into various datasets and automated NLP analytics.
- [Fast start example #1-Parsing](https://www.github.com/llmware-ai/llmware/tree/main/fast_start/example-1-create_first_library.py) - shows the basics of parsing.
- [Fast start example #2-Embedding](https://www.github.com/llmware-ai/llmware/tree/main/fast_start/example-2-build_embeddings.py) - shows the basics of building embeddings.
- [CustomTable examples](https://www.github.com/llmware-ai/llmware/tree/main/Structured_Tables) - ~5 examples to start building structured tables that can be used in conjunction with LLM-based workflows.
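As referenced above, here is a minimal sketch that strings the key ingestion methods together end to end; the folder path, embedding model and vector database names are illustrative placeholders - substitute your own.

```python
from llmware.library import Library
from llmware.retrieval import Query

# create a library, then parse, text chunk and index a folder of documents (path is a placeholder)
library = Library().create_new_library("architecture_quickstart")
library.add_files(input_folder_path="/path/to/docs")

# embed the library - model and vector db are illustrative choices from the supported options
library.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="chromadb")

# run a semantic query against the library
results = Query(library).query("termination provisions", query_type="semantic", result_count=20)

for r in results:
    print("result: ", r)
```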
**2. Model Prompting** - "Fun with LLMs" - the lifecycle of discovering, instantiating, and configuring an LLM-based model to execute an inference, including the ability to seamlessly prepare and integrate knowledge retrieval, as well as post-processing steps to validate accuracy, including:

- ModelCatalog - discover, load and manage configuration
- Inference
- Function Calls
- Prompts
- Prompt with Sources
- Fact Checking methods
- Agent-based multi-step processes
- Prompt History

Core LLMWare classes: **ModelCatalog** (models module), **Prompt**, **LLMfx** (agents module).

Key methods to know (a short inference sketch using these methods appears at the end of this section):

- Discover Models - `ModelCatalog().list_all_models()`
- Load Model - `model = ModelCatalog().load_model(model_name)`
- Inference - `response = model.inference(prompt, add_context=context)`
- Prompt - wraps the model class to provide easy source/retrieval management
- LLMfx - wraps the model class for function-calling SLIM models for agent processes

While ~17 individual model classes are exposed in the models module, for most use cases, we recommend working through the higher-level interface of ModelCatalog, as it promotes code re-use and makes it easy to swap models. In many pipelines, even ModelCatalog does not need to be called directly, as the Prompt class (knowledge retrieval) and the LLMfx class (agents and function calls) provide seamless workflow capabilities and are built on top of the ModelCatalog.

Top examples to get started:

- [Models examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Models) - ~20 examples showing a wide range of different model inferences and use cases, including the ability to integrate Ollama models, OpenChat (e.g., LMStudio) models, using LLama-3 and Phi-3, bringing your own models into the ModelCatalog, and configuring sampling settings.
- [Prompts examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Prompts) - ~5 examples that illustrate how to use Prompt as an integrated workflow for integrating knowledge sources, managing prompt history, and applying fact-checking.
- [SLIM-Agents examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents) - ~20 examples showing how to build multi-model, multi-step Agent processes using locally-running SLIM function calling models.
- [Fast start example #3-Prompts and Models](https://www.github.com/llmware-ai/llmware/tree/main/fast_start/example-3-prompts_and_models.py) - getting started with model inference.

In addition, to support these two key pipelines, LLMWare has a set of supporting and enabling classes and methods, including:

- resource module: CollectionRetrieval, CollectionWriter, PromptState, QueryState, and ParserState - provides an abstraction layer on top of underlying database repositories and separate state mechanisms for major classes
- gguf_configs module: GGUFConfigs
- model_configs module: global_model_repo_catalog_list, global_model_finetuning_prompt_wrappers_lookup, global_default_prompt_catalog
- util module: Utilities
- setup module: Setup
- status module: Status
- exceptions module: LLMWare Exceptions
- web_services module: classes for Wikipedia, YFinance, and WebSite extraction

**End-to-End Use Cases** - we publish and maintain a number of end-to-end use cases in [examples/Use_Cases](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases)
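As referenced in the Model Prompting key methods above, here is a minimal sketch of the discover / load / infer sequence; the model name, prompt text and context passage are illustrative choices.

```python
from llmware.models import ModelCatalog

# discover the models registered in the catalog
all_models = ModelCatalog().list_all_models()
print("number of models in catalog: ", len(all_models))

# load a model by name - "phi-3-gguf" is an illustrative choice used elsewhere in these docs
model = ModelCatalog().load_model("phi-3-gguf")

# run an inference, passing an optional context passage to ground the answer
context = "The lease term is 36 months, starting on January 1, 2024."
response = model.inference("What is the length of the lease term?", add_context=context)

print("response: ", response)
```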
Need help or have questions?
============================

Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).

Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).

## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!

---

---
layout: default
title: Data Stores
parent: Components
nav_order: 9
description: overview of the major modules and classes of LLMWare
permalink: /components/data_stores
---

# Data Stores

---

Simple-to-Scale Database Options - integrated data stores from laptop to parallelized cluster.

```python
from llmware.configs import LLMWareConfig

# to set the collection database - mongo, sqlite, postgres
LLMWareConfig().set_active_db("mongo")

# to set the vector database (or declare when installing)
# -- options: milvus, pg_vector (postgres), redis, qdrant, faiss, pinecone, mongo atlas
LLMWareConfig().set_vector_db("milvus")

# for fast start - no installations required
LLMWareConfig().set_active_db("sqlite")
LLMWareConfig().set_vector_db("chromadb")

# try also faiss and lancedb

# for single postgres deployment
LLMWareConfig().set_active_db("postgres")
LLMWareConfig().set_vector_db("postgres")

# to install mongo, milvus, postgres - see the docker-compose scripts as well as examples
```

Need help or have questions?
============================

Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).

Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).

## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!

---

---
layout: default
title: Embedding Models
parent: Components
nav_order: 6
description: overview of the major modules and classes of LLMWare
permalink: /components/embedding_models
---

# Embedding Models

---

llmware supports 30+ embedding models out of the box in the default ModelCatalog, with easy extensibility to add other popular open source embedding models from HuggingFace or Sentence Transformers.

To get a list of the currently supported embedding models:

```python
from llmware.models import ModelCatalog

embedding_models = ModelCatalog().list_embedding_models()

for i, models in enumerate(embedding_models):
    print(f"embedding models: {i} - {models}")
```

Supported popular models include:

- Sentence Transformers - `all-MiniLM-L6-v2`, `all-mpnet-base-v2`
- Jina AI - `jinaai/jina-embeddings-v2-base-en`, `jinaai/jina-embeddings-v2-small-en`
- Nomic - `nomic-ai/nomic-embed-text-v1`
- Industry BERT - `industry-bert-insurance`, `industry-bert-contracts`, `industry-bert-asset-management`, `industry-bert-sec`, `industry-bert-loans`
- OpenAI - `text-embedding-ada-002`, `text-embedding-3-small`, `text-embedding-3-large`

We also support top embedding models from BAAI, thenlper, llmrails/ember, Google, and Cohere. We are constantly looking to add new innovative open source models to this list, so please let us know if you are looking for support for a specific embedding model - usually within 1-2 days, we can test it and add it to the ModelCatalog.

# Using an Embedding Model

Embedding models in llmware can be loaded directly with `ModelCatalog().load_model("model_name")`, but in most cases, the name of the embedding model will be passed to the `install_new_embedding` handler in the Library class when creating a new embedding.
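For reference, here is a minimal sketch of loading an embedding model directly through the ModelCatalog. The ``embedding()`` call below is an assumption for illustration (check the models module for the exact interface of the loaded embedding model class); in most pipelines you will simply pass the model name to ``install_new_embedding``, as described next.

```python
from llmware.models import ModelCatalog

# load an embedding model directly from the catalog by name
embedding_model = ModelCatalog().load_model("mini-lm-sbert")

# assumption for illustration: the loaded embedding model exposes an embedding()
# method that vectorizes the input text
vector = embedding_model.embedding("What is the notice period for termination?")

print("embedding output type: ", type(vector))
```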
Once that is completed, the embedding model is captured in the Library metadata on the LibraryCard as part of the embedding record for that library, and as a result, often does not need to be referenced explicitly again, e.g.,

```python
from llmware.library import Library

library = Library().create_new_library("my_library")

# parses the content from the documents in the file path, text chunks and indexes in a text collection database
library.add_files(input_folder_path="/local/path/to/my_files", chunk_size=400, max_chunk_size=600, smart_chunking=1)

# creates embeddings - and keeps synchronized records of which text chunks have been embedded to enable incremental use
library.install_new_embedding(embedding_model_name="jinaai/jina-embeddings-v2-small-en", vector_db="milvus", batch_size=100)
```

Once the embeddings are installed on the library, you can look up the embedding status to see the updated embeddings, and confirm that the model has been correctly captured:

```python
from llmware.library import Library

library = Library().load_library("my_library")
embedding_record = library.get_embedding_status()

print("\nupdate: embedding record - ", embedding_record)
```

And then you can run semantic retrievals on the Library, using the Query class in the retrieval module, e.g.:

```python
from llmware.library import Library
from llmware.retrieval import Query

library = Library().load_library("my_library")

# queries are constructed by creating a Query object, and passing a library as input
query_results = Query(library).semantic_query("my query", result_count=20)

for qr in query_results:
    print("my query results: ", qr)
```

Need help or have questions?
============================

Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).

Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).

## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!
---

---
layout: default
title: GGUF
parent: Components
nav_order: 14
description: overview of the major modules and classes of LLMWare
permalink: /components/gguf
---

# GGUF

---

llmware packages its own build of the llama.cpp backend engine to enable running quantized models in GGUF format, which provides an effective packaging to run small language models on both CPUs and GPUs, with fast loading and inference.

The GGUF capability is implemented in the models.py module in the class `GGUFGenerativeModel`, with an extensive set of interfaces and configurations provided in the gguf_configs.py module (which for most users and use cases do not need to be adjusted).

Using a GGUF model is the same as using any other model in the ModelCatalog, e.g.,

```python
from llmware.models import ModelCatalog

gguf_model = ModelCatalog().load_model("phi-3-gguf")
response = gguf_model.inference("What are the benefits of small specialized language models?")

print("response: ", response)
```

# GGUF Platform Support

Within the llmware library, we currently package 6 separate builds of the gguf llama.cpp engine for the following platforms:

# Mac M1/M2/M3

- with Accelerate: "libllama_mac_metal.dylib"
- without Accelerate: "libllama_mac_metal_no_acc.dylib" (note: if you have an old Mac OS installed, it may not have full Accelerate support)
- By default on Mac M1/M2/M3, it will attempt to use the Accelerate (faster) back-end, and if that fails, then it will automatically revert to the no-acc version

# Windows

- CUDA version
- CPU version
- Will look for CUDA drivers, and if found, will try to use the CUDA build, but if that fails, then it will automatically revert to the CPU version.

# Linux

- CUDA version
- CPU version
- Will look for CUDA drivers, and if found, will try to use the CUDA build, but if that fails, then it will automatically revert to the CPU version.

# Troubleshooting CUDA on Windows and Linux

Requirement: Nvidia CUDA 12.1+

-- how to check: `nvcc --version` and `nvidia-smi` - if not found, then drivers are either not installed or not in $PATH and need to be configured

-- if you have older drivers (e.g., v11), then you will need to update them.

# Bring your own custom llama.cpp gguf backend

If you have a unique system requirement, or are looking to optimize for a particular BLAS library with your own build, you can bring your own llama.cpp: build llama_cpp from source and apply custom build settings, or find a prebuilt llama_cpp library in the community that matches your platform. Happy to help if you share the requirements.

```python
from llmware.gguf_configs import GGUFConfigs

GGUFConfigs().set_config("custom_lib_path", "/path/to/your/custom/llama_cpp_backend")

# ... and then load and run the model as usual - the GGUF model class will look at this config and
# load the llama.cpp library found at the custom lib path
```

# Streaming GGUF

```python
""" This example illustrates how to use the stream method for GGUF models for fast streaming of inference,
    especially for real-time chat interactions.

    Please note that the stream method has been implemented for GGUF models starting in llmware-0.2.13. This
    will be any model with the GGUFGenerativeModel class, and generally includes models with names that end in "gguf".

    See also the chat UI example in the UI examples folder.

    We would recommend using a chat optimized model, and have included a representative list below.
""" from llmware.models import ModelCatalog from llmware.gguf_configs import GGUFConfigs # sets an absolute output maximum for the GGUF engine - normally set by default at 256 GGUFConfigs().set_config("max_output_tokens", 1000) chat_models = ["phi-3-gguf", "llama-2-7b-chat-gguf", "llama-3-instruct-bartowski-gguf", "openhermes-mistral-7b-gguf", "zephyr-7b-gguf", "tiny-llama-chat-gguf"] model_name = chat_models[0] # maximum output can be set optionally at any number up to the "max_output_tokens" set model = ModelCatalog().load_model(model_name, max_output=500) text_out = "" token_count = 0 # prompt = "I am interested in gaining an understanding of the banking industry. What topics should I research?" prompt = "What are the benefits of small specialized LLMs?" # since model.stream provides a generator, then use as follows to consume the generator for streamed_token in model.stream(prompt): text_out += streamed_token if text_out.strip(): print(streamed_token, end="") token_count += 1 # final output text and token count print("\n\n***total text out***: ", text_out) print("\n***total tokens***: ", token_count) ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in Oktober 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``! --- --- --- --- layout: default title: Library parent: Components nav_order: 7 description: overview of the major modules and classes of LLMWare permalink: /components/library --- # Library: ingest, organize and index a collection of knowledge at scale - Parse, Text Chunk and Embed. --- Library is the main organizing construct for unstructured information in LLMWare. Users can create one large library with all types of different content, or can create multiple libraries with each library comprising a specific logical collection of information on a particular subject matter, project/case/deal, or even different accounts/users/departments. Each Library consists of the following components: 1. 
Collection on a Database - this is the core of the Library, and is created through parsing of documents, which are then automatically chunked and indexed in a text collection database. This is the basis for retrieval, and also the collection used to track any number of vector embeddings that can be attached to a library collection. 2. File archives - found in the llmware_data path, within Accounts, there is a folder structure for each Library. All file-based artifacts for the Library are organized in these folders, including copies of all files added to the library (very useful for retrieval-based applications), images extracted and indexed from the source documents, as well as derived artifacts such as NLP and knowledge graph outputs and datasets. 3. Library Catalog - each Library is registered in the LibraryCatalog table, with a unique library_card that has the key attributes and statistics of the Library. When a Library object is passed to the Parser, the parser will automatically route all information into the Library structure. The Library also exposes convenience methods to easily install embeddings on a library, including tracking of incremental progress. To parse into a Library, there is a very useful convenience method, `add_files`, which will invoke the Parser, collate and route the files within a selected folder path, check for duplicate files, execute the parsing, text chunking and insertion into the database, and update all of the Library state automatically. Libraries are the main index constructs used in executing a Query. Pass the library object when constructing the Query object, and then all retrievals (text, semantic and hybrid) will be executed against the content in that Library only. ```python from llmware.library import Library # to parse and text chunk a set of documents (pdf, pptx, docx, xlsx, txt, csv, md, json/jsonl, wav, png, jpg, html) # step 1 - create a library, which is the 'knowledge-base container' construct # - libraries have both text collection (DB) resources, and file resources (e.g., llmware_data/accounts/{library_name}) # - embeddings and queries are run against a library lib = Library().create_new_library("my_library") # step 2 - add_files is the universal ingestion function - point it at a local file folder with mixed file types # - files will be routed by file extension to the correct parser, parsed, text chunked and indexed in text collection DB lib.add_files("/folder/path/to/my/files") # to install an embedding on a library - pick an embedding model and vector_db lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="milvus", batch_size=500) # to add a second embedding to the same library (mix-and-match models + vector db) lib.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="chromadb", batch_size=100) # easy to create multiple libraries for different projects and groups finance_lib = Library().create_new_library("finance_q4_2023") finance_lib.add_files("/finance_folder/") hr_lib = Library().create_new_library("hr_policies") hr_lib.add_files("/hr_folder/") # pull library card with key metadata - documents, text chunks, images, tables, embedding record lib_card = Library().get_library_card("my_library") # see all libraries all_my_libs = Library().get_all_library_cards() ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).
Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``! --- --- --- --- layout: default title: Model Catalog parent: Components nav_order: 2 description: overview of the major modules and classes of LLMWare permalink: /components/model_catalog --- # Model Catalog: Access all models the same way with easy lookup, regardless of underlying implementation. - 150+ Models in Catalog with 50+ RAG-optimized BLING, DRAGON and Industry BERT models - 18 SLIM function-calling small language models for Agent use cases - Full support for GGUF, HuggingFace, Sentence Transformers and major API-based models - Easy to extend to add custom models - see examples Generally, all models can be identified using either the `model_name` or `display_name`, which provides some flexibility to expose a more "UI friendly" name or an informal short-name for a commonly-used model. The default model list is implemented in the model_configs.py module, which is then generally accessed in the models.py module through the `ModelCatalog` class, which also provides the ability to add models of various types, overwrite the default list by loading a custom model catalog from a JSON file, and other useful interfaces into the list of models.
```python from llmware.models import ModelCatalog from llmware.prompts import Prompt # all models accessed through the ModelCatalog models = ModelCatalog().list_all_models() # to use any model in the ModelCatalog - "load_model" method and pass the model_name parameter my_model = ModelCatalog().load_model("llmware/bling-phi-3-gguf") output = my_model.inference("what is the future of AI?", add_context="Here is the article to read") # to integrate model into a Prompt prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") response = prompter.prompt_main("what is the future of AI?", context="Insert Sources of information") ``` # ADD a Custom GGUF to the ModelCatalog ```python import time import re from llmware.models import ModelCatalog from llmware.prompts import Prompt # Step 1 - register new gguf model - we will pick the popular LLama-2-13B-chat-GGUF ModelCatalog().register_gguf_model(model_name="TheBloke/Llama-2-13B-chat-GGUF-Q2", gguf_model_repo="TheBloke/Llama-2-13B-chat-GGUF", gguf_model_file_name="llama-2-13b-chat.Q2_K.gguf", prompt_wrapper="my_version_inst") # Step 2- if the prompt_wrapper is a standard, e.g., Meta's , then no need to do anything else # -- however, if the model uses a custom prompt wrapper, then we need to define that too # -- in this case, we are going to create our "own version" of the Meta wrapper ModelCatalog().register_new_finetune_wrapper("my_version_inst", main_start="", llm_start="") # Once we have completed these two steps, we are done - and can begin to use the model like any other prompter = Prompt().load_model("TheBloke/Llama-2-13B-chat-GGUF-Q2") question_list = ["I am interested in gaining an understanding of the banking industry. What topics should I research?", "What are some tips for creating a successful business plan?", "What are the best books to read for a class on American literature?"] for i, entry in enumerate(question_list): start_time = time.time() print("\n") print(f"query - {i + 1} - {entry}") response = prompter.prompt_main(entry) # Print results time_taken = round(time.time() - start_time, 2) llm_response = re.sub("[\n\n]", "\n", response['llm_response']) print(f"llm_response - {i + 1} - {llm_response}") print(f"time_taken - {i + 1} - {time_taken}") ``` # ADD an Ollama Model ```python from llmware.models import ModelCatalog # Step 1 - register your Ollama models in llmware ModelCatalog # -- these two lines will register: llama2 and mistral models # -- note: assumes that you have previously cached and installed both of these models with ollama locally # register llama2 ModelCatalog().register_ollama_model(model_name="llama2",model_type="chat",host="localhost",port=11434) # register mistral - note: if you are using ollama defaults, then OK to register with ollama model name only ModelCatalog().register_ollama_model(model_name="mistral") # optional - confirm that model was registered my_new_model_card = ModelCatalog().lookup_model_card("llama2") print("\nupdate: confirming - new ollama model card - ", my_new_model_card) # Step 2 - start using the Ollama model like any other model in llmware print("\nupdate: calling ollama llama 2 model ...") model = ModelCatalog().load_model("llama2") response = model.inference("why is the sky blue?") print("update: example #1 - ollama llama 2 response - ", response) # Tip: if you are loading 'llama2' chat model from Ollama, note that it is already included in # the llmware model catalog under a different name, "TheBloke/Llama-2-7B-Chat-GGUF" # the llmware model name maps to the original HuggingFace 
repository, and is a nod to "TheBloke" who has # led the popularization of GGUF - and is responsible for creating most of the GGUF model versions. # --llmware uses the "Q4_K_M" model by default, while Ollama generally prefers "Q4_0" print("\nupdate: calling Llama-2-7B-Chat-GGUF in llmware catalog ...") model = ModelCatalog().load_model("TheBloke/Llama-2-7B-Chat-GGUF") response = model.inference("why is the sky blue?") print("update: example #1 - [compare] - llmware / Llama-2-7B-Chat-GGUF response - ", response) # Now, let's try the Ollama Mistral model with a context passage model2 = ModelCatalog().load_model("mistral") context_passage= ("NASA’s rover Perseverance has gathered data confirming the existence of ancient lake " "sediments deposited by water that once filled a giant basin on Mars called Jerezo Crater, " "according to a study published on Friday. The findings from ground-penetrating radar " "observations conducted by the robotic rover substantiate previous orbital imagery and " "other data leading scientists to theorize that portions of Mars were once covered in water " "and may have harbored microbial life. The research, led by teams from the University of " "California at Los Angeles (UCLA) and the University of Oslo, was published in the " "journal Science Advances. It was based on subsurface scans taken by the car-sized, six-wheeled " "rover over several months of 2022 as it made its way across the Martian surface from the " "crater floor onto an adjacent expanse of braided, sedimentary-like features resembling, " "from orbit, the river deltas found on Earth.") response = model2.inference("What are the top 3 points?", add_context=context_passage) print("\nupdate: calling ollama mistral model ...") print("update: example #2 - ollama mistral response - ", response) # Step 3 - using the ollama discovery API - optional discovery = model2.discover_models() print("\nupdate: example #3 - checking ollama model manifest list: ", discovery) if len(discovery) > 0: # note: assumes tht you have at least one model registered in ollama -otherwise, may throw error for i, models in enumerate(discovery["models"]): print("ollama models: ", i, models) ``` # Add a LM Studio Model ```python from llmware.models import ModelCatalog from llmware.prompts import Prompt # one step process: add the open chat model to the Model Registry # key params: # model_name = "my_open_chat_model1" # api_base = uri_path to the proposed endpoint # prompt_wrapper = alpaca | | chat_ml | hf_chat | human_bot # -> Llama2-Chat # hf_chat -> Zephyr-Mistral # chat_ml -> OpenHermes - Mistral # human_bot -> Dragon models # model_type = "chat" (alternative: "completion") ModelCatalog().register_open_chat_model("my_open_chat_model1", api_base="http://localhost:1234/v1", prompt_wrapper="", model_type="chat") # once registered, you can invoke like any other model in llmware prompter = Prompt().load_model("my_open_chat_model1") response = prompter.prompt_main("What is the future of AI?") # you can (optionally) register multiple open chat models with different api_base and model attributes ModelCatalog().register_open_chat_model("my_open_chat_model2", api_base="http://localhost:5678/v1", prompt_wrapper="hf_chat", model_type="chat") ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). 
# About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  • {% endfor %}
--- --- --- --- layout: default title: Prompt with Sources parent: Components nav_order: 10 description: overview of the major modules and classes of LLMWare permalink: /components/prompt_with_sources --- # Prompt with Sources --- Prompt with Sources is the easiest way to combine knowledge retrieval with an LLM inference - the Prompt class provides several high-level methods to easily integrate a retrieval/query/parsing step into a prompt, to be used as a source for running an inference on a model. This is best illustrated with a simple example: ```python from llmware.prompts import Prompt # build a prompt and attach a model prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") # add_source_document method: accepts any supported document type, parses the file, and creates text chunks # if a query is passed, then it will run a quick in-memory filtering search against the text chunks # the text chunks are packaged into sources with all of the accompanying metadata from the file, and made # available automatically in batches to be used in prompting - source = prompter.add_source_document("/folder/to/one/doc/", "filename", query="fast query") # to run inference with 'prompt with sources' -> source will be automatically added to the prompt responses = prompter.prompt_with_source("my query") # depending upon the size of the source (and batching relative to the model context window), there may be more than # a single inference run, so unpack potentially multiple responses for i, response in enumerate(responses): print("response: ", i, response) ``` # FACT CHECKING prompt_with_source also provides integrated fact-checking methods that use the packaged source information to validate key elements of the llm_response. ```python from llmware.prompts import Prompt prompter = Prompt().load_model("bling-answer-tool", temperature=0.0, sample=False) # contract is parsed, text-chunked, and then filtered by "base salary" source = prompter.add_source_document("/local/folder/path", "my_document.pdf", query="exact filter query") # calling the LLM with 'source' information from the contract automatically packaged into the prompt responses = prompter.prompt_with_source("my question to the document", prompt_name="default_with_context") # run several fact checks # checks for numbers match ev_numbers = prompter.evidence_check_numbers(responses) # looks for statistical overlap to identify potential sources for the llm response ev_sources = prompter.evidence_check_sources(responses) # builds set of comparison stats between the llm_response and the sources ev_stats = prompter.evidence_comparison_stats(responses) # identifies if a response is a "not found" response z = prompter.classify_not_found_response(responses, parse_response=True, evidence_match=True, ask_the_model=False) for r, response in enumerate(responses): print("LLM Response: ", response["llm_response"]) print("Numbers: ", ev_numbers[r]["fact_check"]) print("Sources: ", ev_sources[r]["source_review"]) print("Stats: ", ev_stats[r]["comparison_stats"]) print("Not Found Check: ", z[r]) ``` In addition to `add_source_document`, the Prompt class implements the following other methods to easily integrate sources into prompts: # Add Source - Query Results - Two Options ```python from llmware.prompts import Prompt from llmware.retrieval import Query from llmware.library import Library # build a prompt prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") # Option A - run query and then add query_results to the prompt my_lib =
Library().load_library("my_library") results = Query(my_lib).query("my query") source2 = prompter.add_source_query_results(results) # Option B - run a new query against a library and load directly into a prompt source3 = prompter.add_source_new_query(my_lib, query="my new query", query_type="semantic", result_count=15) ``` # Add Other Sources ```python from llmware.prompts import Prompt # build a prompt prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") # add wikipedia articles as a source wiki_source = prompter.add_source_wikipedia("topic", article_count=5, query="filter among retrieved articles") # add a website as a source website_source = prompter.add_source_website("my_url", query="filter among website") # add an entire library (should be small, e.g., just a couple of documents) source = prompter.add_source_library("my_library") ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in Oktober 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  • {% endfor %}
--- --- --- --- layout: default title: Query parent: Components nav_order: 8 description: overview of the major modules and classes of LLMWare permalink: /components/query --- # Retrieval & Query --- Query libraries with a mix of text, semantic, hybrid, metadata, and custom filters. The retrieval.py module implements the `Query` class, which is the primary way that search and retrieval is performed. Each `Query` object, when constructed, requires that a Library be passed as a mandatory parameter in the constructor. The Query object will operate against that Library, and has access to all of the Library's specific attributes, metadata and methods. Retrievals in llmware leverage the Library abstraction as the primary unit against which a particular query or retrieval is executed. This provides the ability to have multiple distinct knowledge-bases, potentially aligned to different use cases, and/or users, accounts and permissions. # Executing Queries ```python from llmware.retrieval import Query from llmware.library import Library # step 1 - load a previously created library lib = Library().load_library("my_library") # step 2 - create a query object q = Query(lib) # step 3 - run lots of different queries (many other options in the examples) # basic text query results1 = q.text_query("text query", result_count=20, exact_mode=False) # semantic query results2 = q.semantic_query("semantic query", result_count=10) # combining a text query restricted to only certain documents in the library and "exact" match to the query results3 = q.text_query_with_document_filter("new query", {"file_name": "selected file name"}, exact_mode=True) # to apply a specific embedding (if multiple on library), pass the model and vector db names when creating the query object q2 = Query(lib, embedding_model_name="mini_lm_sbert", vector_db="milvus") results4 = q2.semantic_query("new semantic query") ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  • {% endfor %}
--- --- --- --- layout: default title: RAG Optimized Models parent: Components nav_order: 3 description: overview of the major modules and classes of LLMWare permalink: /components/rag_optimized_models --- # RAG Optimized Models --- RAG-Optimized Models - 1-7B parameter models designed for RAG workflow integration and running locally. ## Meet our Models - **SLIM model series:** small, specialized models fine-tuned for function calling and multi-step, multi-model Agent workflows. - **DRAGON model series:** Production-grade RAG-optimized 6-7B parameter models - "Delivering RAG on ..." the leading foundation base models. - **BLING model series:** Small CPU-based RAG-optimized, instruct-following 1B-3B parameter models. - **Industry BERT models:** out-of-the-box custom trained sentence transformer embedding models fine-tuned for the following industries: Insurance, Contracts, Asset Management, SEC. - **GGUF Quantization:** we provide 'gguf' and 'tool' versions of many SLIM, DRAGON and BLING models, optimized for CPU deployment. ```python """ This 'Hello World' example demonstrates how to get started using local BLING models with provided context, using both Pytorch and GGUF versions. """ import time from llmware.prompts import Prompt def hello_world_questions(): test_list = [ {"query": "What is the total amount of the invoice?", "answer": "$22,500.00", "context": "Services Vendor Inc. \n100 Elm Street Pleasantville, NY \nTO Alpha Inc. 5900 1st Street " "Los Angeles, CA \nDescription Front End Engineering Service $5000.00 \n Back End Engineering" " Service $7500.00 \n Quality Assurance Manager $10,000.00 \n Total Amount $22,500.00 \n" "Make all checks payable to Services Vendor Inc. Payment is due within 30 days." "If you have any questions concerning this invoice, contact Bia Hermes. " "THANK YOU FOR YOUR BUSINESS! INVOICE INVOICE # 0001 DATE 01/01/2022 FOR Alpha Project P.O. # 1000"}, {"query": "What was the amount of the trade surplus?", "answer": "62.4 billion yen ($416.6 million)", "context": "Japan’s September trade balance swings into surplus, surprising expectations" "Japan recorded a trade surplus of 62.4 billion yen ($416.6 million) for September, " "beating expectations from economists polled by Reuters for a trade deficit of 42.5 " "billion yen. Data from Japan’s customs agency revealed that exports in September " "increased 4.3% year on year, while imports slid 16.3% compared to the same period " "last year. According to FactSet, exports to Asia fell for the ninth straight month, " "which reflected ongoing China weakness. Exports were supported by shipments to " "Western markets, FactSet added. — Lim Hui Jie"}, {"query": "When did the LISP machine market collapse?", "answer": "1987.", "context": "The attendees became the leaders of AI research in the 1960s." " They and their students produced programs that the press described as 'astonishing': " "computers were learning checkers strategies, solving word problems in algebra, " "proving logical theorems and speaking English. By the middle of the 1960s, research in " "the U.S. was heavily funded by the Department of Defense and laboratories had been " "established around the world. Herbert Simon predicted, 'machines will be capable, " "within twenty years, of doing any work a man can do'. Marvin Minsky agreed, writing, " "'within a generation ... the problem of creating 'artificial intelligence' will " "substantially be solved'. They had, however, underestimated the difficulty of the problem. " "Both the U.S. 
and British governments cut off exploratory research in response " "to the criticism of Sir James Lighthill and ongoing pressure from the US Congress " "to fund more productive projects. Minsky's and Papert's book Perceptrons was understood " "as proving that artificial neural networks approach would never be useful for solving " "real-world tasks, thus discrediting the approach altogether. The 'AI winter', a period " "when obtaining funding for AI projects was difficult, followed. In the early 1980s, " "AI research was revived by the commercial success of expert systems, a form of AI " "program that simulated the knowledge and analytical skills of human experts. By 1985, " "the market for AI had reached over a billion dollars. At the same time, Japan's fifth " "generation computer project inspired the U.S. and British governments to restore funding " "for academic research. However, beginning with the collapse of the Lisp Machine market " "in 1987, AI once again fell into disrepute, and a second, longer-lasting winter began."}, {"query": "What is the current rate on 10-year treasuries?", "answer": "4.58%", "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, " "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy " "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " "jobs. However, wages rose less than expected last month. Stocks posted a stunning " "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " "At its session low, the Dow had fallen as much as 198 points; it surged by more than " "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " "their lowest points in the day. Traders were unclear of the reason for the intraday " "reversal. Some noted it could be the softer wage number in the jobs report that made " "investors rethink their earlier bearish stance. Others noted the pullback in yields from " "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " "near its highest level in 14 years. The benchmark rate later eased from those levels, but " "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s " "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries " "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially " "some oversold conditions.'"}, {"query": "Is the expected gross margin greater than 70%?", "answer": "Yes, between 71.5% and 72.%", "context": "Outlook NVIDIA’s outlook for the third quarter of fiscal 2024 is as follows:" "Revenue is expected to be $16.00 billion, plus or minus 2%. GAAP and non-GAAP " "gross margins are expected to be 71.5% and 72.5%, respectively, plus or minus " "50 basis points. GAAP and non-GAAP operating expenses are expected to be " "approximately $2.95 billion and $2.00 billion, respectively. 
GAAP and non-GAAP " "other income and expense are expected to be an income of approximately $100 " "million, excluding gains and losses from non-affiliated investments. GAAP and " "non-GAAP tax rates are expected to be 14.5%, plus or minus 1%, excluding any discrete items." "Highlights NVIDIA achieved progress since its previous earnings announcement " "in these areas: Data Center Second-quarter revenue was a record $10.32 billion, " "up 141% from the previous quarter and up 171% from a year ago. Announced that the " "NVIDIA® GH200 Grace™ Hopper™ Superchip for complex AI and HPC workloads is shipping " "this quarter, with a second-generation version with HBM3e memory expected to ship " "in Q2 of calendar 2024. "}, {"query": "What is Bank of America's rating on Target?", "answer": "Buy", "context": "Here are some of the tickers on my radar for Thursday, Oct. 12, taken directly from " "my reporter’s notebook: It’s the one-year anniversary of the S&P 500′s bear market bottom " "of 3,577. Since then, as of Wednesday’s close of 4,376, the broad market index " "soared more than 22%. Hotter than expected September consumer price index, consumer " "inflation. The Social Security Administration issues announced a 3.2% cost-of-living " "adjustment for 2024. Chipotle Mexican Grill (CMG) plans price increases. Pricing power. " "Cites consumer price index showing sticky retail inflation for the fourth time " "in two years. Bank of America upgrades Target (TGT) to buy from neutral. Cites " "risk/reward from depressed levels. Traffic could improve. Gross margin upside. " "Merchandising better. Freight and transportation better. Target to report quarter " "next month. In retail, the CNBC Investing Club portfolio owns TJX Companies (TJX), " "the off-price juggernaut behind T.J. Maxx, Marshalls and HomeGoods. Goldman Sachs " "tactical buy trades on Club names Wells Fargo (WFC), which reports quarter Friday, " "Humana (HUM) and Nvidia (NVDA). BofA initiates Snowflake (SNOW) with a buy rating." "If you like this story, sign up for Jim Cramer’s Top 10 Morning Thoughts on the " "Market email newsletter for free. Barclays cuts price targets on consumer products: " "UTZ Brands (UTZ) to $16 per share from $17. Kraft Heinz (KHC) to $36 per share from " "$38. Cyclical drag. J.M. Smucker (SJM) to $129 from $160. Secular headwinds. " "Coca-Cola (KO) to $59 from $70. Barclays cut PTs on housing-related stocks: Toll Brothers" "(TOL) to $74 per share from $82. Keeps underweight. Lowers Trex (TREX) and Azek" "(AZEK), too. Goldman Sachs (GS) announces sale of fintech platform and warns on " "third quarter of 19-cent per share drag on earnings. The buyer: investors led by " "private equity firm Sixth Street. Exiting a mistake. Rise in consumer engagement for " "Spotify (SPOT), says Morgan Stanley. The analysts hike price target to $190 per share " "from $185. Keeps overweight (buy) rating. JPMorgan loves elf Beauty (ELF). Keeps " "overweight (buy) rating but lowers price target to $139 per share from $150. " "Sees “still challenging” environment into third-quarter print. The Club owns shares " "in high-end beauty company Estee Lauder (EL). Barclays upgrades First Solar (FSLR) " "to overweight from equal weight (buy from hold) but lowers price target to $224 per " "share from $230. Risk reward upgrade. 
Best visibility of utility scale names."}, {"query": "What was the rate of decline in 3rd quarter sales?", "answer": "20% year-on-year.", "context": "Nokia said it would cut up to 14,000 jobs as part of a cost cutting plan following " "third quarter earnings that plunged. The Finnish telecommunications giant said that " "it will reduce its cost base and increase operation efficiency to “address the " "challenging market environment. The substantial layoffs come after Nokia reported " "third-quarter net sales declined 20% year-on-year to 4.98 billion euros. Profit over " "the period plunged by 69% year-on-year to 133 million euros."}, {"query": "What is a list of the key points?", "answer": "•Stocks rallied on Friday with stronger-than-expected U.S jobs data and increase in " "Treasury yields;\n•Dow Jones gained 195.12 points;\n•S&P 500 added 1.59%;\n•Nasdaq Composite rose " "1.35%;\n•U.S. economy added 438,000 jobs in August, better than the 273,000 expected;\n" "•10-year Treasury rate trading near the highest level in 14 years at 4.58%.", "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, " "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy " "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " "jobs. However, wages rose less than expected last month. Stocks posted a stunning " "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " "At its session low, the Dow had fallen as much as 198 points; it surged by more than " "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " "their lowest points in the day. Traders were unclear of the reason for the intraday " "reversal. Some noted it could be the softer wage number in the jobs report that made " "investors rethink their earlier bearish stance. Others noted the pullback in yields from " "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " "near its highest level in 14 years. The benchmark rate later eased from those levels, but " "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s " "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries " "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially " "some oversold conditions.'"} ] return test_list # this is the main script to be run def bling_meets_llmware_hello_world (model_name): t0 = time.time() # load the questions test_list = hello_world_questions() print(f"\n > Loading Model: {model_name}...") # load the model prompter = Prompt().load_model(model_name) t1 = time.time() print(f"\n > Model {model_name} load time: {t1-t0} seconds") for i, entries in enumerate(test_list): print(f"\n{i+1}. 
Query: {entries['query']}") # run the prompt output = prompter.prompt_main(entries["query"],context=entries["context"] , prompt_name="default_with_context",temperature=0.30) # print out the results llm_response = output["llm_response"].strip("\n") print(f"LLM Response: {llm_response}") print(f"Gold Answer: {entries['answer']}") print(f"LLM Usage: {output['usage']}") t2 = time.time() print(f"\nTotal processing time: {t2-t1} seconds") return 0 if __name__ == "__main__": # list of 'rag-instruct' laptop-ready small bling models on HuggingFace pytorch_models = ["llmware/bling-1b-0.1", # most popular "llmware/bling-tiny-llama-v0", # fastest "llmware/bling-1.4b-0.1", "llmware/bling-falcon-1b-0.1", "llmware/bling-cerebras-1.3b-0.1", "llmware/bling-sheared-llama-1.3b-0.1", "llmware/bling-sheared-llama-2.7b-0.1", "llmware/bling-red-pajamas-3b-0.1", "llmware/bling-stable-lm-3b-4e1t-v0", "llmware/bling-phi-3" # most accurate (and newest) ] # Quantized GGUF versions generally load faster and run nicely on a laptop with at least 16 GB of RAM gguf_models = ["bling-phi-3-gguf", "bling-stablelm-3b-tool", "dragon-llama-answer-tool", "dragon-yi-answer-tool", "dragon-mistral-answer-tool"] # try model from either pytorch or gguf model list # the newest (and most accurate) is 'bling-phi-3-gguf' bling_meets_llmware_hello_world(gguf_models[0]) # check out the model card on Huggingface for RAG benchmark test performance results and other useful information ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in Oktober 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  • {% endfor %}
--- --- --- --- layout: default title: Release History parent: Components nav_order: 15 description: llmware is an integrated framework with 50+ models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. permalink: /components/release_history --- Release History --- - For Specific Wheels: [Wheel Archives](https://www.github.com/llmware-ai/llmware/tree/main/wheel_archives) - For Feature Details: [Main README-'Release notes and Change Log'](https://www.github.com/llmware-ai/llmware/tree/main/) New wheels are generally built and published to PyPI on a weekly basis, with updated PyPI versioning. The development repo is updated and current at all times, but may have updates that are not yet in the PyPI wheel. All wheels are built and tested on: 1. Mac Metal 2. Windows x86 (+ with CUDA) 3. Linux x86 (+ with CUDA) - most testing on Ubuntu 22 and Ubuntu 20 - which are recommended. 4. Mac x86 (see 0.2.11 note below) 5. Linux aarch64* (see 0.2.7 note below) **Release Notes** --**0.3.0** released in the week of June 4, 2024 - continued pruning of required dependencies with split of python dependencies into a small minimal set of requirements (~10 in requirements.txt) that are included in the pip install, with an additional set of optional dependencies provided as 'extras', reflected in both the requirements_extras.txt file, and available over pip with the added instruction - `pip3 install 'llmware[full]'`. Notably, commonly used libraries such as transformers, torch and openai are now in the 'extras' as most llmware use cases do not require them, and this greatly simplifies the ability to install llmware. The `welcome_to_llmware.sh` and `welcome_to_llmware_windows.sh` scripts have also been updated to install both the 'core' and 'extra' sets of requirements. Other subtle, but significant, architectural changes include offering more extensibility for adding new model classes, configurable global base model methods for post_init and register, a new InferenceHistory state manager, and enhanced logging options. --**0.2.15** released in the week of May 20, 2024 - removed pytorch dependency as a global import, and shifted to dynamic loading of torch in the event that it is called in a specific model class. This enables running most of llmware code and examples without pytorch or transformers loaded. The main areas of torch (and transformers) dependency are in using HFGenerativeModels and HFEmbeddingModels. - note: we have seen some new errors caused with Pytorch 2.3 - which are resolved by down-leveling to `pip3 install torch==2.1` - note: there are a couple of new warnings from within transformers and huggingface_hub libraries - these can be safely ignored. We have seen that keeping `local_dir_use_symlinks = False` when pulling model artifacts from Huggingface is still the safer option in some environments. --**0.2.13** released in the week of May 12, 2024 - clean up of dependencies in both requirements.txt and Setup (PyPI) - install of the vector db python sdk (e.g., pymilvus, chromadb, etc.) is now required as a separate step outside of `pip3 install llmware` - the intent is to keep the dependency matrix as simple as possible and avoid potential dependency conflicts on install, especially for packages which in turn have a large number of dependencies. If you run into any issues with install dependencies, please raise an issue. A typical install sequence reflecting these notes is sketched below.
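To illustrate the install pattern described in the 0.3.0 and 0.2.13 notes above, here is a minimal sketch. The core and 'full' commands are taken directly from the notes; the vector db line is only an example - install whichever SDK (pymilvus, chromadb, etc.) matches the vector database you plan to use.

```shell
# core install - minimal set of required dependencies
pip3 install llmware

# optional 'extras' (e.g., transformers, torch, openai) - per the 0.3.0 note
pip3 install 'llmware[full]'

# vector db SDKs are installed separately - per the 0.2.13 note (example only)
pip3 install chromadb
```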
--**0.2.12** released in the week of May 5, 2024 - added Python 3.12 support, and deprecated the use of faiss for v3.12+. We have changed the "Fast Start" no-install option to use chromadb or lancedb rather than faiss. Refactoring of code especially with Datasets, Graph and Web Services as separate modules. --**0.2.11** released in the week of April 29, 2024 - updated GGUF libs for Phi-3 and Llama-3 support, and added new prebuilt shared libraries to support WhisperCPP. We are also deprecating support for Mac x86 going forward - will continue to support on most major components but not all new features going forward will be built specifically for Mac x86 (which Apple stopped shipping in 2022). Our intent is to keep narrowing our testing matrix to provide better support on key platforms. We have also added better safety checks for older versions of Mac OS running on M1/M2/M3 (no_acc option in GGUF and Whisper libs), as well as a custom check to find CUDA drivers on Windows (independent of Pytorch). --**0.2.9** released in the week of April 15, 2024 - minor continued improvements to the parsers plus roll-out of new CustomTable class for rapidly integrating structured information into LLM-based workflows and data pipelines, including converting JSON/JSONL files and CSV files into structured DB tables. --**0.2.8** released in the week of April 8, 2024 - significant improvements to the Office parser with new libs on all platforms. Conforming changes with the PDF parser in terms of exposing more options for text chunking strategies, encoding, and range of capture options (e.g., tables, images, header text, etc). Linux aarch64 libs deprecated and kept at 0.2.6 - some new features will not be available on Linux aarch64 - we recommend using Ubuntu20+ on x86_64 (with and without CUDA). --**0.2.7** released in the week of April 1, 2024 - significant improvements to the PDF parser with new libs on all platforms. Important note that we are keeping linux aarch64 at 0.2.6 libs - and will be deprecating support going forward. For Linux, we recommend Ubuntu20+ and x86_64 (with and without CUDA). --**0.2.5** released in the week of March 12, 2024 - continued enhancements of the GGUF implementation, especially for CUDA support, and re-compiling of all binaries to support Ubuntu 20 and Ubuntu 22. Ubuntu requirements are: CUDA 12.1 (to use GPU), and GLIBC 2.31+. --**GGUF on Windows CUDA**: useful notes and debugging tips - 1. Requirement: Nvidia CUDA 12.1+ -- how to check: `nvcc --version` and `nvidia-smi` - if not found, then drivers are either not installed or not in $PATH and need to be configured -- if you have older drivers (e.g., v11), then you will need to update them. 2. Requirement: CUDA-enabled Pytorch (pre-0.2.11) -- starting with 0.2.11, we have implemented a custom check to evaluate if CUDA is present, independent of Pytorch. -- for pre-0.2.11, we use Pytorch to check for CUDA drivers, e.g., `torch.cuda.is_available()` and `torch.version.cuda` 3. Installing a CUDA-enabled Pytorch - useful install script: (not required post-0.2.11 for GGUF on Windows) -- `pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121` 4. Fall-back to CPU - if llmware can not load the CUDA-enabled drivers, it will automatically try to fall back to the CPU version of the drivers. -- you can also adjust the GGUFConfigs().set_config - ("use_gpu", False) - and then it will automatically go to the CPU drivers. 5. 
Custom GGUF libraries - if you have a unique system requirement, you can build llama_cpp from source, and apply custom build settings - or find in the community a prebuilt llama_cpp library that matches your platform. Happy to help if you share the requirements. -- to "bring your own GGUF": GGUFConfigs().set_config("custom_lib_path", "/path/to/your/custom/llama_cpp_backend") -> and llmware will try to load that library. 6. Issues? - please raise an Issue on GitHub, or on Discord - and we can work with you to get you up and running! --**0.2.4** released in the week of February 26, 2024 - major upgrade of GGUF implementation to support more options, including CUDA support - which is the main source of growth in the size of the wheel package. -- Note: We will look at making some of the CUDA builds 'optional' or 'bring your own' over time. -- Note: We will also start to 'prune' the list of wheels kept in the archive to keep the total repo size manageable for cloning. --**0.2.2** introduced SLIM models and the new LLMfx class, and the capabilities for multi-model, multi-step Agent-based processes. --**0.2.0** released in the week of January 22, 2024 - significant enhancements, including integration of Postgres and SQLite drivers into the C library parsers. --New examples involving Postgres or SQLite support (including 'Fast Start' examples) will require a fresh pip install of 0.2.0 or a clone of the repo. --If cloning the repo, please be especially careful to pick up the new updated /lib dependencies for your platform. --The new libs have new dependencies on Linux in particular - most extensive testing on Ubuntu 22. If any issues arise on a specific version of Linux, please raise a ticket. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  • {% endfor %}
--- --- --- --- layout: default title: SLIM Models parent: Components nav_order: 5 description: overview of the major modules and classes of LLMWare permalink: /components/slim_models --- # SLIM Models - Function Calling with Small Language Models --- Generally, function-calling is a specialized capability of frontier language models, such as OpenAI GPT-4. We have adapted this concept to small language models through SLIMs (Structured Language Instruction Models), which are 'single function' models fine-tuned to accept three main inputs to construct a prompt (a text passage, a function, and optional parameters - described in more detail below). As of June 2024, there are 18 distinct SLIM function calling models, with many more on the way, covering most common extraction, classification, and summarization tasks: **Models List** If you would like more information about any of the SLIM models, please check out their model card: - extract - extract custom keys - [slim-extract](https://www.huggingface.co/llmware/slim-extract) & [slim-extract-tool](https://www.huggingface.co/llmware/slim-extract-tool) - summary - summarize function call - [slim-summary](https://www.huggingface.co/llmware/slim-summary) & [slim-summary-tool](https://www.huggingface.co/llmware/slim-summary-tool) - xsum - title/headline function call - [slim-xsum](https://www.huggingface.co/llmware/slim-xsum) & [slim-xsum-tool](https://www.huggingface.co/llmware/slim-xsum-tool) - ner - extract named entities - [slim-ner](https://www.huggingface.co/llmware/slim-ner) & [slim-ner-tool](https://www.huggingface.co/llmware/slim-ner-tool) - sentiment - evaluate sentiment - [slim-sentiment](https://www.huggingface.co/llmware/slim-sentiment) & [slim-sentiment-tool](https://www.huggingface.co/llmware/slim-sentiment-tool) - topics - generate topic - [slim-topics](https://www.huggingface.co/llmware/slim-topics) & [slim-topics-tool](https://www.huggingface.co/llmware/slim-topics-tool) - sa-ner - combo model (sentiment + named entities) - [slim-sa-ner](https://www.huggingface.co/llmware/slim-sa-ner) & [slim-sa-ner-tool](https://www.huggingface.co/llmware/slim-sa-ner-tool) - boolean - provides a yes/no output with explanation - [slim-boolean](https://www.huggingface.co/llmware/slim-boolean) & [slim-boolean-tool](https://www.huggingface.co/llmware/slim-boolean-tool) - ratings - apply 1 (low) - 5 (high) rating - [slim-ratings](https://www.huggingface.co/llmware/slim-ratings) & [slim-ratings-tool](https://www.huggingface.co/llmware/slim-ratings-tool) - emotions - assess emotions - [slim-emotions](https://www.huggingface.co/llmware/slim-emotions) & [slim-emotions-tool](https://www.huggingface.co/llmware/slim-emotions-tool) - tags - auto-generate list of tags - [slim-tags](https://www.huggingface.co/llmware/slim-tags) & [slim-tags-tool](https://www.huggingface.co/llmware/slim-tags-tool) - tags-3b - enhanced auto-generation tagging model - [slim-tags-3b](https://www.huggingface.co/llmware/slim-tags-3b) & [slim-tags-3b-tool](https://www.huggingface.co/llmware/slim-tags-3b-tool) - intent - identify intent - [slim-intent](https://www.huggingface.co/llmware/slim-intent) & [slim-intent-tool](https://www.huggingface.co/llmware/slim-intent-tool) - category - high-level category - [slim-category](https://www.huggingface.co/llmware/slim-category) & [slim-category-tool](https://www.huggingface.co/llmware/slim-category-tool) - nli - assess if evidence supports conclusion - [slim-nli](https://www.huggingface.co/llmware/slim-nli) & [slim-nli-tool](https://www.huggingface.co/llmware/slim-nli-tool) - sql - convert text into sql - [slim-sql](https://www.huggingface.co/llmware/slim-sql) & [slim-sql-tool](https://www.huggingface.co/llmware/slim-sql-tool)
You may also want to check out these quantized 'answer' tools, which work well in conjunction with SLIMs for question-answer and summarization: - bling-stablelm-3b-tool - 3b quantized RAG model - [bling-stablelm-3b-gguf](https://www.huggingface.co/llmware/bling-stablelm-3b-gguf) - bling-answer-tool - 1b quantized RAG model - [bling-answer-tool](https://www.huggingface.co/llmware/bling-answer-tool) - dragon-yi-answer-tool - 6b quantized RAG model - [dragon-yi-answer-tool](https://www.huggingface.co/llmware/dragon-yi-answer-tool) - dragon-mistral-answer-tool - 7b quantized RAG model - [dragon-mistral-answer-tool](https://www.huggingface.co/llmware/dragon-mistral-answer-tool) - dragon-llama-answer-tool - 7b quantized RAG model - [dragon-llama-answer-tool](https://www.huggingface.co/llmware/dragon-llama-answer-tool) All SLIM models have a common prompting structure. Inputs: -- text passage - this is the core passage or piece of text that you would like the model to assess -- function - classify, extract, generate - this is handled by default by the model class, so usually does not need to be explicitly declared - but is an option for SLIMs that support more than one function -- params - depends upon the model, used to configure/guide the behavior of the function call - optional for some SLIMs Outputs: -- structured Python output, generally either a dictionary or list Main objectives: -- enable function calling with small, locally-running models, -- simplify prompts by defining specific functions and fine-tuning the model to respond accordingly without 'prompt magic' -- standardize outputs so that they can be handled programmatically as part of a multi-step workflow. ```python from llmware.models import ModelCatalog def discover_slim_models(): """ Discover a list of SLIM tools in the Model Catalog. -- SLIMs are available in both traditional Pytorch and quantized GGUF packages. -- Generally, we train/fine-tune in Pytorch and then package in 4-bit quantized GGUF for inference. -- By default, we designate the GGUF versions with 'tool' or 'gguf' in their names. -- GGUF versions are generally faster to load, faster for inference and use less memory in most environments.""" tools = ModelCatalog().list_llm_tools() tool_map = ModelCatalog().get_llm_fx_mapping() print("\nList of SLIM model tools (GGUF) in the ModelCatalog\n") for i, tool in enumerate(tools): model_card = ModelCatalog().lookup_model_card(tool_map[tool]) print(f"{i} - tool: {tool} - " f"model_name: {model_card['model_name']} - " f"model_family: {model_card['model_family']}") return 0 def hello_world_slim(): """ SLIM models can be identified in the ModelCatalog like any llmware model. Instead of the standard inference method, SLIM models are used with the function_call method that prepares a special prompt instruction, and takes optional parameters. This example shows a series of function calls with different SLIM models. Please note that the first time the models are used, they will be pulled from the llmware Hugging Face repository, which will take a couple of minutes. Future calls will be much faster once the models are cached locally. """ print("\nExecuting Function Call Inferences with SLIMs\n") # Sentiment Analysis passage1 = ("This is one of the best quarters we can remember for the industrial sector " "with significant growth across the board in new order volume, as well as price " "increases in excess of inflation. We continue to see very strong demand, especially " "in Asia and Europe. 
Accordingly, we remain bullish on the tier 1 suppliers and would " "be accumulating more stock on any dips.") # here are the two key lines of code model = ModelCatalog().load_model("slim-sentiment-tool") response = model.function_call(passage1) print("sentiment response: ", response['llm_response']) # Named Entity Recognition passage2 = "Michael Johnson was a famous Olympic sprinter from the U.S. in the early 2000s." model = ModelCatalog().load_model("slim-ner-tool") response = model.function_call(passage2) print("ner response: ", response['llm_response']) # Extract anything with Slim-extract passage3 = ("Adobe shares tumbled as much as 11% in extended trading Thursday after the design software maker " "issued strong fiscal first-quarter results but came up slightly short on quarterly revenue guidance. " "Here’s how the company did, compared with estimates from analysts polled by LSEG, formerly known as Refinitiv: " "Earnings per share: $4.48 adjusted vs. $4.38 expected Revenue: $5.18 billion vs. $5.14 billion expected " "Adobe’s revenue grew 11% year over year in the quarter, which ended March 1, according to a statement. " "Net income decreased to $620 million, or $1.36 per share, from $1.25 billion, or $2.71 per share, " "in the same quarter a year ago. During the quarter, Adobe abandoned its $20 billion acquisition of " "design software startup Figma after U.K. regulators found competitive concerns. The company paid " "Figma a $1 billion termination fee.") model = ModelCatalog().load_model("slim-extract-tool") response = model.function_call(passage3, function="extract", params=["revenue growth"]) print("extract response: ", response['llm_response']) # Generate questions with Slim-Q-Gen model = ModelCatalog().load_model("slim-q-gen-tiny-tool", temperature=0.2, sample=True) # supported params - "question", "multiple choice", "boolean" response = model.function_call(passage3, params=['multiple choice']) print("question generation response: ", response['llm_response']) # Generate topic model = ModelCatalog().load_model("slim-topics-tool") response = model.function_call(passage3) print("topics response: ", response['llm_response']) # Generate headline summary with slim-xsum model = ModelCatalog().load_model("slim-xsum-tool", temperature=0.0, sample=False) response = model.function_call(passage3) print("xsum response: ", response['llm_response']) # Generate boolean with optional "(explain)" in parameter model = ModelCatalog().load_model("slim-boolean-tool") response = model.function_call(passage3, params=["Did Adobe revenue increase? (explain)"]) print("boolean response: ", response['llm_response']) # Generate tags model = ModelCatalog().load_model("slim-tags-tool", temperature=0.0, sample=False) response = model.function_call(passage3) print("tags response: ", response['llm_response']) return 0 def using_logits_and_integrating_into_process(): """ This example shows two key elements of function calling SLIM models - 1. Using Logit Information to indicate confidence levels, especially for classifications. 2. Using the structured dictionary generated for programmatic handling in a larger process. """ print("\nExample: using logits and integrating into process\n") text_passage = ("On balance, this was an average result, with earnings in line with expectations and " "no big surprises to either the positive or the negative.") # two key lines (load_model + execute function_call) + additional logit_analysis step sentiment_model = ModelCatalog().load_model("slim-sentiment-tool", get_logits=True) response = sentiment_model.function_call(text_passage) analysis = ModelCatalog().logit_analysis(response, sentiment_model.model_card, sentiment_model.hf_tokenizer_name) print("sentiment response: ", response['llm_response']) print("\nAnalyzing response") for keys, values in analysis.items(): print(f"{keys} - {values}") # two key attributes of the sentiment output dictionary sentiment_value = response["llm_response"]["sentiment"] confidence_level = analysis["confidence_score"] # use the sentiment classification as an 'if...then' decision point in a process if "positive" in sentiment_value: print("sentiment is positive .... will take 'positive' analysis path ...", sentiment_value) else: print("sentiment is negative .... will take 'negative' analysis path ...", sentiment_value) if "positive" in sentiment_value and confidence_level > 0.8: print("sentiment is positive with high confidence ... ", sentiment_value, confidence_level) return 0 if __name__ == "__main__": # discovering slim models in the llmware catalog discover_slim_models() # running function call inferences hello_world_slim() # doing interesting stuff with the output using_logits_and_integrating_into_process() ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  {% endfor %}
--- --- --- --- layout: default title: Vector Databases parent: Components nav_order: 11 description: overview of the major modules and classes of LLMWare permalink: /components/vector_databases --- # Vector Databases --- llmware supports the following vector databases: - Milvus and Milvus-Lite - `milvus` - Postgres (PG Vector) - `postgres` - Qdrant - `qdrant` - ChromaDB - `chromadb` - Redis - `redis` - Neo4j - `neo4j` - LanceDB - `lancedb` - FAISS - `faiss` - Mongo-Atlas - `mongo-atlas` - Pinecone - `pinecone` In llmware, unstructured content is ingested and organized into a Library, and then embeddings are created against the Library object, usually handled implicitly through the Library method `.install_new_embedding`. All embedding models are implemented through the embeddings.py module, and the `EmbeddingHandler` class, which routes the embedding process to the vector db specific handler and provides a common set of utility functions. In most cases, it is not necessary to call the vector db class explicitly. The design is intended to promote code re-use and to make it easy to experiment with different endpoint vector databases without significant code changes, as well as to leverage the Library as the core organizing construct. # Select Vector DB Selecting a vector database in llmware is generally done in one of two ways: 1. Explicit Setting - `LLMWareConfig().set_vector_db("postgres")` 2. Pass the name of the vector database at the time of installing the embeddings: `library.install_new_embedding(embedding_model_name=embedding_model, vector_db='milvus', batch_size=100)` # Install Vector DB No-install options: chromadb, lancedb, faiss, and milvus-lite API-based options: mongo-atlas, pinecone Install server options: Generally, we have found that Docker (and Docker-Compose) are the easiest and most consistent ways to install a vector db server across different platforms. 1. milvus - we provide a docker-compose script in the main repository root folder path, which installs mongodb as well. ```bash curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose_mongo_milvus.yaml docker compose up -d ``` 2. qdrant ```bash curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose-qdrant.yaml docker compose up -d ``` 3. postgres and pgvector ```bash curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose-pgvector.yaml docker compose up -d ``` 4. redis ```bash curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose-redis-stack.yaml docker compose up -d ``` 5. neo4j ```bash curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose-neo4j.yaml docker compose up -d ```
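Once a server is up (or one of the no-install options is selected), connecting it to a library only takes a couple of lines. The following is a minimal sketch, not a full example - it assumes a library named `my_library` has already been created and parsed (retrieved here with `load_library`), and simply combines the two selection approaches described above:

```python
from llmware.configs import LLMWareConfig
from llmware.library import Library

# Option 1 - set the vector db once as a global default (e.g., the no-install chromadb),
# then install the embedding without naming the vector db explicitly
LLMWareConfig().set_vector_db("chromadb")

library = Library().load_library("my_library")   # assumes this library was created and parsed earlier
library.install_new_embedding(embedding_model_name="mini-lm-sbert", batch_size=100)

# Option 2 - name the vector db directly at embedding time (overrides the default)
library.install_new_embedding(embedding_model_name="mini-lm-sbert",
                              vector_db="milvus", batch_size=100)
```

In either case, the `EmbeddingHandler` takes care of routing the embedding process to the selected vector database.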
# Configure Vector DB To configure a vector database in llmware, we provide configuration objects in the `configs.py` module to adjust authentication, port/host information, and other common configurations. To use a configuration object, the pattern follows simple `get_config` and `set_config` methods: ```python from llmware.configs import MilvusConfig MilvusConfig().set_config("lite", True) from llmware.configs import ChromaDBConfig current_config = ChromaDBConfig().get_config("persistent_path") ChromaDBConfig().set_config("persistent_path", "/new/local/path") ``` Configuration objects are provided for the following vector databases: `MilvusConfig`, `ChromaDBConfig`, `QdrantConfig`, `Neo4jConfig`, `LanceDBConfig`, `PineConeConfig`, `MongoConfig`, `PostgresConfig`. For 'out-of-the-box' testing and development, in most use cases you will not need to change these configs. Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  {% endfor %}
--- --- --- --- layout: default title: Whisper CPP parent: Components nav_order: 14 description: overview of the major modules and classes of LLMWare permalink: /components/whisper_cpp --- # Whisper CPP --- llmware has an integrated WhisperCPP backend which enables fast, easy local voice-to-text processing. Whisper is a leading open source voice-to-text model from OpenAI - https://github.com/openai/whisper WhisperCPP is the implementation of Whisper packaged as a GGML deliverable - https://github.com/ggerganov/whisper.cpp Starting with llmware 0.2.11, we have integrated WhisperCPPModel as a new model class, providing options for direct inference, and coming soon, integration into the Parser for easy text chunking and parsing into a Library with other document types. llmware provides prebuilt shared libraries for WhisperCPP on the following platforms: --Mac M series --Linux x86 (no CUDA) --Linux x86 (with CUDA) - really fast --Windows x86 (CPU only) currently. We have added three Whisper models to the default model catalog: 1. ggml-base.en.bin - english-only base model 2. ggml-base.bin - multi-lingual base model 3. ggml-small.en-tdrz.bin - this is a 'tiny-diarize' implementation that has been fine-tuned to identify the speakers and insert special [_SOLM_] tags to indicate a conversation turn / change of speaker. Main repo: https://github.com/akashmjn/tinydiarize/ Citation: @software{mahajan2023tinydiarize, author = {Mahajan, Akash}, month = {08}, title = {tinydiarize: Minimal extension of Whisper for speaker segmentation with special tokens}, url = {https://github.com/akashmjn/tinydiarize}, year = {2023} } To use WAV files, there is one additional Python dependency required: --pip install librosa --Note: this has been added to the default requirements.txt and PyPI build starting with 0.2.11 To use other popular audio/video file formats, such as MP3, MP4, M4A, etc., the following dependencies are required: --pip install pydub --ffmpeg library - which can be installed as follows: -- Linux: `sudo apt install ffmpeg` -- Mac: `brew install ffmpeg` -- Windows: direct download and install from the official ffmpeg site ```python """ This example shows how to use llmware provided sample files for testing with WhisperCPP, integrated as of llmware 0.2.11. # examples - "famous_quotes" | "greatest_speeches" | "youtube_demos" | "earnings_calls" -- famous_quotes - approximately 20 small .wav files with clips from old movies and speeches -- greatest_speeches - approximately 60 famous historical speeches in english -- youtube_demos - wav files of ~3 llmware youtube videos -- earnings_calls - wav files of ~4 public company earnings calls (gathered from public investor relations) These sample files are hosted in a non-restricted AWS S3 bucket, and downloaded via the Setup method `load_voice_sample_files`. There are two options: -- small_only = True: only pulls the 'famous_quotes' samples -- small_only = False: pulls all of the samples (requires ~1.9 GB in total) Please note that all of these samples have been pulled from open public domain sources, including the Internet Archives, e.g., https://archive.org. These sample files are being provided solely for the purpose of testing the code scripts below. Please do not use them for any other purpose. 
To run these examples, please make sure to `pip install librosa` """ import os from llmware.models import ModelCatalog from llmware.gguf_configs import GGUFConfigs from llmware.setup import Setup # optional / to adjust various parameters of the model GGUFConfigs().set_config("whisper_cpp_verbose", "OFF") GGUFConfigs().set_config("whisper_cpp_realtime_display", True) # note: english is default output - change to 'es' | 'fr' | 'de' | 'it' ... GGUFConfigs().set_config("whisper_language", "en") GGUFConfigs().set_config("whisper_remove_segment_markers", True) def sample_files(example="famous_quotes", small_only=False): """ Execute a basic inference on a Voice-to-Text model, passing a file_path string """ voice_samples = Setup().load_voice_sample_files(small_only=small_only) examples = ["famous_quotes", "greatest_speeches", "youtube_demos", "earnings_calls"] if example not in examples: print("choose one of the following - ", examples) return 0 fp = os.path.join(voice_samples, example) files = os.listdir(fp) # these are the two key lines whisper_base_english = "whisper-cpp-base-english" model = ModelCatalog().load_model(whisper_base_english) for f in files: if f.endswith(".wav"): prompt = os.path.join(fp, f) print(f"\n\nPROCESSING: prompt = {prompt}") response = model.inference(prompt) print("\nllm response: ", response["llm_response"]) print("usage: ", response["usage"]) return 0 if __name__ == "__main__": # pick among the four examples: famous_quotes | greatest_speeches | youtube_demos | earnings_calls sample_files(example="famous_quotes", small_only=False) ``` Need help or have questions? ============================ Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  {% endfor %}
--- --- --- --- layout: default title: Code contributions parent: Contributing nav_order: 1 permalink: /contributing/code --- # Contributing code One way to contribute to ``llmware`` is by contributing to the code base. We briefly describe some of the important modules of ``llmware`` next, so you can more easily navigate the code base. You may also take a look at our [fast start series from YouTube](https://www.youtube.com/playlist?list=PL1-dn33KwsmD7SB9iSO6vx4ZLRAWea1DB). ## Core modules ### Library The *library* module implements the classes **Library** and **LibraryCatalog**. The **Library** class implements the *library* concept. A *library* is a collection of documents, where a document can be a PDF, an image, or an office document. It is responsible for parsing, text chunking, and indexing. In other words, it does the heavy lifting of adding content. In the following, we briefly describe the functions for adding documents to the library. ```python add_file( self, file_path): ``` This method adds one document of any supported type to the library. ```python add_files( self, input_folder_path=None, encoding="utf-8", chunk_size=400, get_images=True,get_tables=True, smart_chunking=2, max_chunk_size=600, table_grid=True, get_header_text=True, table_strategy=1, strip_header=False, verbose_level=2, copy_files_to_library=True): ``` This method adds the documents of one folder to the library. ```python add_website( self, url, get_links=True, max_links=5): ``` This method adds a website, and links from the website, to the library. ```python add_wiki( self, topic_list, target_results=10): ``` This method adds Wikipedia articles to the library. ```python add_dialogs( self, input_folder=None): ``` This method adds an AWS dialog transcript to the library. ```python add_image( self, input_folder=None): ``` This method adds images to the library. ```python add_pdf_by_ocr( self, input_folder=None): ``` This method adds scanned PDFs to the library. ```python add_pdf( self, input_folder=None): ``` This method adds PDFs to the library. ```python add_office( self, input_folder=None): ``` This method adds office documents to the library. ### Embeddings An *embedding* combines a vector store with an embedding model. It is responsible for applying an embedding model to a library, storing the embeddings in a vector store, and providing access to the embeddings with natural language queries. We briefly describe the common methods offered by all vector stores below. ```python def create_new_embedding( self, doc_ids=None, batch_size=500): ``` This method creates the embeddings and adds them to the vector store. ```python def search_index( self, query_embedding_vector, sample_count=10): ``` This method searches the vector store given the query vector. ```python def delete_index(self): ``` This method deletes the created vector store index. ### Prompts A *prompt* is an input to a model. The prompt is used by the model to generate the response. One important use case is that users want to augment a prompt, or a series of prompts, with additional information. Next, we describe the methods for augmenting a prompt in this way. ```python def add_source_new_query( self, library, query=None, query_type="semantic", result_count=10): ``` This method adds the results of the ``query`` to the prompt. ```python def add_source_query_results( self, query_results): ``` This method adds previous results from a query as a source to the prompt. 
```python def add_source_library( self, library_name): ``` This method adds an entire library to the prompt. We recommend that you only use this when the library is sufficiently small. ```python def add_source_wikipedia( self, topic, article_count=3, query=None): ``` This method adds wikipedia articles to the prompt based on the provided ``topic``. ```python def add_source_yahoo_finance( self, ticker=None, key_list=None): ``` This method adds a Yahoo finance ticker to the prompt. ```python def add_source_knowledge_graph( self, library, query): ``` This method adds the summary output elements from a knowledge graph based on the provided ``query``. Please note that this method is experimental, i.e. unstable, and is subject to change dramatically in each new version. ```python def add_source_website( self, url, query=None): ``` This method adds the website pointed to by the ``url`` to the prompt. ```python def add_source_document( self, input_fp, input_fn, query=None): ``` This method adds a document, or documents, of any supported type to the prompt. If documents are added, then the ``query`` parameter can be used to filter the documents. ```python def add_source_last_interaction_step( self): ``` This method adds the last interaction to the prompt. The use case for this is to enable interactive dialog, i.e. chatting. ### Model Catalog A *model catalog* is a collection of models. In the following, we briefly describe the methods for adding new models to the catalog. ```python def register_new_hf_generative_model( self, hf_model_name=None, context_window=2048, prompt_wrapper="", display_name=None, temperature=0.3, trailing_space="", link=""): ``` This method adds a new generative model from hugging face. Users can therefore add models from hugging face that are unsupported currently. ```python def register_sentence_transformer_model( self, model_name, embedding_dims, context_window, display_name=None, link=""): ``` This method adds a new sentence transformer. ```python def register_gguf_model( self, model_name, gguf_model_repo, gguf_model_file_name, prompt_wrapper=None, eos_token_id=0, display_name=None, trailing_space="", temperature=0.3, context_window=2048, instruction_following=True): ``` This method adds a new GGUF model. ```python def register_open_chat_model( cls, model_name, api_base=None, model_type="chat", display_name=None, context_window=4096, instruction_following=True, prompt_wrapper="", temperature=0.5): ``` This method adds any chat model that is available through a web API, e.g. a chat model that is available locally via localhost. ```python def register_ollama_model( cls, model_name, host="localhost", port=11434, model_type="chat", raw=False, stream=False, display_name=None, context_window=4096, instruction_following=True, prompt_wrapper="", temperature=0.5): ``` This method adds an OLLama model that is available through a web API. The method is similar to the ``register_open_chat_model`` method above. ### Categories of code contributions #### New or Enhancement to existing Features You want to submit a code contribution that adds a new feature or enhances an existing one? Then the best way to start is by opening a discussion in our [GitHub discussions](https://github.com/llmware-ai/llmware/discussions). Please do this before you work on it, so you do not put effort into it just to realise after submission that it will not be merged. #### Bugs If you encounter a bug, you can - File an issue about the bug. 
- Provide a self-contained minimal example that reproduces the bug, which is extremely important. - Provide possible solutions for the bug. - Submit a pull request to fix the bug. We encourage you to read [How to create a Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example) from the Stack Overflow help center, and the tag description of [self-contained](https://stackoverflow.com/tags/self-contained/info), also from Stack Overflow. --- --- layout: default title: Contributing nav_order: 7 has_children: true description: llmware contributions. permalink: /contributing --- # Contributing to llmware {: .note} > The contributions to `llmware` are governed by our [Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md). {: .warning} > Have you found a security issue? Then please jump to [Security Vulnerabilities](#security-vulnerabilities). On this page, we provide information about ``llmware`` contributions. There are **two ways** you can contribute. The first is by making **code contributions**, and the second by making contributions to the **documentation**. Please look at our [contribution suggestions](#how-can-you-contribute) if you need inspiration, or take a look at [open issues](#open-issues). Contributions to `llmware` are welcome from everyone. Our goal is to make the process simple, transparent, and straightforward. We are happy to receive suggestions on how the process can be improved. ## How can you contribute? {: .note} > If you have never contributed before, look for issues with the tag [``good first issue``](https://github.com/llmware-ai/llmware/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22). The most common ways to contribute are to add new features, fix bugs, add tests, or add documentation. You can visit the [issues](https://github.com/llmware-ai/llmware/issues) site of the project and search for tags such as ``bug``, ``enhancement``, ``documentation``, or ``test``. Here is a non-exhaustive list of contributions you can make. 1. Code refactoring 2. Add new text databases 3. Add new vector databases 4. Fix bugs 5. Add usage examples (see for example the issues [jupyter notebook - more examples and better support](https://github.com/llmware-ai/llmware/issues/508) and [google colab examples and start up scripts](https://github.com/llmware-ai/llmware/issues/507)) 6. Add experimental features 7. Improve code quality 8. Improve documentation in the docs (what you are reading right now) 9. Improve documentation by adding or updating docstrings in modules, classes, methods, or functions (see for example [Add docstrings](https://github.com/llmware-ai/llmware/issues/219)) 10. Improve test coverage 11. Answer questions in our [Discord channel](https://discord.gg/MhZn5Nc39h), especially in the [technical support forum](https://discord.com/channels/1179245642770559067/1218498778915672194) 12. Post projects in which you use ``llmware`` in our Discord forum [made with llmware](https://discord.com/channels/1179245642770559067/1218567269471486012), ideally with a link to a public GitHub repository ## Open Issues If you're interested in existing issues, you can - Look for issues; if you are new to the project, look for issues with the `good first issue` label. - Provide answers for questions in our [GitHub discussions](https://github.com/llmware-ai/llmware/discussions) - Provide help for bug or enhancement issues. - Ask questions, reproduce the issues, or provide solutions. - Submit a pull request to fix the issue. 
## Security Vulnerabilities **If you believe you've found a security vulnerability, then please _do not_ submit an issue ticket or pull request or otherwise publicly disclose the issue.** Please follow the process at [Reporting a Vulnerability](https://github.com/llmware-ai/llmware/blob/main/Security.md) ## GitHub workflow We follow the [``fork-and-pull``](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork) Git workflow. 1. [Fork](https://docs.github.com/en/github/getting-started-with-github/fork-a-repo) the repository on GitHub. 2. Clone your fork to your local machine with `git clone git@github.com:<your-username>/llmware.git`. 3. Create a branch with `git checkout -b my-topic-branch`. 4. Run the test suite by navigating to the tests/ folder and running ```./run-tests.py -s``` to ensure there are no failures. 5. [Commit](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/committing-changes-to-a-pull-request-branch-created-from-a-fork) changes to your own branch, then push to GitHub with `git push origin my-topic-branch`. 6. Submit a [pull request](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests) so that we can review your changes. Remember to [synchronize your forked repository](https://docs.github.com/en/github/getting-started-with-github/fork-a-repo#keep-your-fork-synced) _before_ submitting proposed changes upstream. If you have an existing local repository, please update it before you start, to minimize the chance of merge conflicts. ```shell git remote add upstream git@github.com:llmware-ai/llmware.git git fetch upstream git checkout upstream/main -b my-topic-branch ``` ## Community Questions and discussions are welcome in any shape or form. Please feel free to join our community on our Discord channel, on which we are active daily. You are also welcome if you just want to post an idea! - [Discord Channel](https://discord.gg/MhZn5Nc39h) - [GitHub discussions](https://github.com/llmware-ai/llmware/discussions) --- --- layout: default title: Documentation contributions parent: Contributing nav_order: 2 permalink: /contributing/documentation --- # Contributing documentation One way to contribute to ``llmware`` is by contributing documentation. There are **two ways** to contribute to the ``llmware`` documentation. The first is via **docstrings in the code**, and the second is **the docs**, which is what you are *currently reading*. In both areas, you can contribute in a lot of ways. Here is a non-exhaustive list of these ways for the docstrings, which also apply to the docs. 1. Add documentation (e.g., adding a docstring to a function) 2. Update documentation (e.g., update a docstring that is not in sync with the code) 3. Simplify documentation (e.g., formulate a docstring more clearly) 4. Enhance documentation (e.g., add more examples to a docstring or fix typos) ## Docstrings **Docstrings** document the code within the code, which allows programmers to easily have a look while they are programming. For an example, have a look at [this docstring](https://github.com/llmware-ai/llmware/blob/c9e12a7a150162986622738e127c37ac70f31cd6/llmware/agents.py#L27-L66) which documents the ``LLMfx`` class. We follow the docstring style of **numpy**, for which you can find an example [here](https://github.com/numpy/numpydoc/blob/main/doc/example.py) and [here](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html). 
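To make the expected layout concrete, here is a short sketch of a numpy-style docstring on a small helper function - the function itself is invented purely for illustration and is not part of the ``llmware`` code base:

```python
def chunk_text(text, chunk_size=400):
    """Split a text passage into chunks of roughly equal size.

    Parameters
    ----------
    text : str
        The input passage to be split.
    chunk_size : int, optional
        Target number of characters per chunk (default is 400).

    Returns
    -------
    list of str
        The list of text chunks, in original order.

    Examples
    --------
    >>> chunk_text("a short example", chunk_size=5)
    ['a sho', 'rt ex', 'ample']
    """
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```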
Please be sure to follow the conventions and go over your pull request before you submit it. ## Docs {: .note} > All commands are executed from the `docs` sub-directory. Contributing to this documentation is extremely important as many users will refer to it. If you plan to contribute to the docs, we recommend that you install `jekyll` locally so you can test your changes before submitting them. We also recommend that you install `jekyll` into an isolated ruby environment so it does not interfere with any other installations you might have. We recommend that you install `rbenv` and `rvm` to manage your ruby installation. `rbenv` is a tool that manages different ruby versions, similar to what `conda` does for `python`. Please [install rbenv](https://github.com/rbenv/rbenv?tab=readme-ov-file#installation) following their instructions, and do the same to [install rvm](https://github.com/rvm/rvm?tab=readme-ov-file#installing-rvm). We recommend that you install a ruby version `>=3.0`. After having installed an isolated ruby version, you have to install the dependencies to build the docs locally. The `docs` directory has a `Gemfile` which specifies the dependencies. You can hence simply navigate to it and use the `bundle install` command. ```bash bundle install ``` You should now be able to build and serve the documentation locally. To do this, simply do the following. ```bash bundle exec jekyll server --livereload --verbose ``` In the browser of your choice, you can then go to `http://127.0.0.1:4000/` and you will be served the documentation, which is rebuilt and reloaded after any change to the `docs`. ``jekyll`` will create a ``_site`` directory where it saves the created files, please **never commit any files from the \_site directory**! ## Open Issues If you're interested in existing issues, you can - Look for issues with the `good first issue` and `documentation` label as a good place to get started. - Provide answers for questions in our [GitHub discussions](https://github.com/llmware-ai/llmware/discussions) - Provide help for bug or enhancement issues. - Ask questions, reproduce the issues, or provide solutions. - Submit a pull request to fix the issue. --- --- layout: default title: Agents parent: Examples nav_order: 2 description: overview of the major modules and classes of LLMWare permalink: /examples/agents --- # Agents 🚀 Start Building Multi-Model Agents Locally on a Laptop 🚀 =============== **What is a SLIM?** **SLIMs** are **S**tructured **L**anguage **I**nstruction **M**odels, which are small, specialized 1-3B parameter LLMs, fine-tuned to generate structured outputs (Python dictionaries and lists, JSON and SQL) that can be handled programmatically, and stacked together in multi-step, multi-model Agent workflows - all running on a local CPU. **New SLIMs just released** - check out slim-extract, slim-summary, slim-xsum, slim-sa-ner, slim-boolean and slim-tags-3b **Check out the new examples below marked with ⭐** 🔥🔥🔥 Web Services & Function Calls ([code](web_services_slim_fx.py)) 🔥🔥🔥 **Check out the Intro videos** [SLIM Intro Video](https://www.youtube.com/watch?v=cQfdaTcmBpY) There are 16 SLIM models, each delivered in two packages - a Pytorch/Huggingface FP16 model, and a quantized "tool" designed for fast inference on a CPU, using LLMWare's embedded GGUF inference engine. In most cases, we would recommend that you start with the "tools" version of the models. 
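Before diving into the examples, the basic usage pattern for any of the 'tool' versions is just two calls - load the model from the ModelCatalog and invoke `function_call` on a text passage. A minimal sketch (the passage below is invented for illustration):

```python
from llmware.models import ModelCatalog

# load the quantized 'tool' version of the sentiment SLIM from the catalog
model = ModelCatalog().load_model("slim-sentiment-tool")

# run a function call on a short passage - the output is a structured Python dictionary
response = model.function_call("The quarter was disappointing, with revenue well below expectations.")
print(response["llm_response"])
```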
**Getting Started** We have several ready-to-run examples in this repository:

| Example | Detail |
|---------|--------|
| 1. Getting Started with SLIM Models ([code](slims-getting-started.py) / [video](https://www.youtube.com/watch?v=aWZFrTDmMPc&t=196s)) | Install the models and run hello world tests to see the models in action. |
| 2. Getting Started with Function-Calling Agent ([code](agent-llmfx-getting-started.py) / [video](https://www.youtube.com/watch?v=cQfdaTcmBpY)) | Generate a Structured Report with LLMfx |
| 3. Multi-step Complex Analysis with Agent ([code](agent-multistep-analysis.py) / [video](https://www.youtube.com/watch?v=y4WvwHqRR60)) | Delivering Complex Research Analysis with SLIM Agents |
| 4. Document Clustering ([code](document-clustering.py)) | Multi-faceted automated document analysis with Topics, Tags and NER |
| 5. Two-Step NER Retrieval ([code](ner-retrieval.py)) | Using NER to extract a name, and then using it as the basis for retrieval. |
| 6. Using Sentiment Analysis ([code](sentiment-analysis.py)) | Using sentiment analysis on earnings transcripts and an 'if...then' condition |
| 7. Text2SQL - Intro ([code](text2sql-getting-started-1.py)) | Getting Started with SLIM-SQL-TOOL and Basic Text2SQL Inference |
| 8. Text2SQL - E2E ([code](text2sql-end-to-end-2.py)) | End-to-End Natural Language Query to SQL DB Query |
| 9. Text2SQL - MultiStep ([code](text2sql-multistep-example-3.py)) | Extract a customer name using NER and use in a Text2SQL query |
| 10. ⭐ Web Services & Function Calls ([code](web_services_slim_fx.py)) | Generate 30 key financial analyses with SLIM function calls and web services |
| 11. ⭐ Yes-No Questions with Explanations ([code](using_slim_boolean_model.py)) | Analyze earnings releases with SLIM Boolean |
| 12. ⭐ Extracting Revenue Growth ([code](using_slim_extract_model.py)) | Extract revenue growth from earnings releases with SLIM Extract |
| 13. ⭐ Summary as a Function Call ([code](using_slim_summary.py)) | Simple Summarization as a Function Call with List Length Parameters |
| 14. ⭐ Handling Not Found Extracts ([code](not_found_extract_with_lookup.py)) | Multi-step Lookup strategy and handling not-found answers |
| 15. ⭐ Extract + Lookup ([code](custom_extract_and_lookup.py)) | Extract Named Entity information and use for lookups with SLIM Extract |
| 16. ⭐ Headline/Title as XSUM Function Call ([code](using_slim_xsum.py)) | eXtreme Summarization (XSUM) with SLIM XSUM |

For information on all of the SLIM models, check out [LLMWare SLIM Model Collection](https://www.huggingface.co/llmware/). 
**Models List** If you would like more information about any of the SLIM models, please check out their model card: - extract - extract custom keys - [slim-extract](https://www.huggingface.co/llmware/slim-extract) & [slim-extract-tool](https://www.huggingface.co/llmware/slim-extract-tool) - summary - summarize function call - [slim-summary](https://www.huggingface.co/llmware/slim-summary) & [slim-summary-tool](https://www.huggingface.co/llmware/slim-summary-tool) - xsum - title/headline function call - [slim-xsum](https://www.huggingface.co/llmware/slim-xsum) & [slim-xsum-tool](https://www.huggingface.co/llmware/slim-xsum-tool) - ner - extract named entities - [slim-ner](https://www.huggingface.co/llmware/slim-ner) & [slim-ner-tool](https://www.huggingface.co/llmware/slim-ner-tool) - sentiment - evaluate sentiment - [slim-sentiment](https://www.huggingface.co/llmware/slim-sentiment) & [slim-sentiment-tool](https://www.huggingface.co/llmware/slim-sentiment-tool) - topics - generate topic - [slim-topics](https://www.huggingface.co/llmware/slim-topics) & [slim-topics-tool](https://www.huggingface.co/llmware/slim-topics-tool) - sa-ner - combo model (sentiment + named entities) - [slim-sa-ner](https://www.huggingface.co/llmware/slim-sa-ner) & [slim-sa-ner-tool](https://www.huggingface.co/llmware/slim-sa-ner-tool) - boolean - provides a yes/no output with explanation - [slim-boolean](https://www.huggingface.co/llmware/slim-boolean) & [slim-boolean-tool](https://www.huggingface.co/llmware/slim-boolean-tool) - ratings - apply 1 (low) - 5 (high) rating - [slim-ratings](https://www.huggingface.co/llmware/slim-ratings) & [slim-ratings-tool](https://www.huggingface.co/llmware/slim-ratings-tool) - emotions - assess emotions - [slim-emotions](https://www.huggingface.co/llmware/slim-emotions) & [slim-emotions-tool](https://www.huggingface.co/llmware/slim-emotions-tool) - tags - auto-generate list of tags - [slim-tags](https://www.huggingface.co/llmware/slim-tags) & [slim-tags-tool](https://www.huggingface.co/llmware/slim-tags-tool) - tags-3b - enhanced auto-generation tagging model - [slim-tags-3b](https://www.huggingface.co/llmware/slim-tags-3b) & [slim-tags-3b-tool](https://www.huggingface.co/llmware/slim-tags-3b-tool) - intent - identify intent - [slim-intent](https://www.huggingface.co/llmware/slim-intent) & [slim-intent-tool](https://www.huggingface.co/llmware/slim-intent-tool) - category - high-level category - [slim-category](https://www.huggingface.co/llmware/slim-category) & [slim-category-tool](https://www.huggingface.co/llmware/slim-category-tool) - nli - assess if evidence supports conclusion - [slim-nli](https://www.huggingface.co/llmware/slim-nli) & [slim-nli-tool](https://www.huggingface.co/llmware/slim-nli-tool) - sql - convert text into sql - [slim-sql](https://www.huggingface.co/llmware/slim-sql) & [slim-sql-tool](https://www.huggingface.co/llmware/slim-sql-tool) You may also want to check out these quantized 'answer' tools, which work well in conjunction with SLIMs for question-answer and summarization: - bling-stablelm-3b-tool - 3b quantized RAG model - [bling-stablelm-3b-gguf](https://www.huggingface.co/llmware/bling-stablelm-3b-gguf) - bling-answer-tool - 1b quantized RAG model - [bling-answer-tool](https://www.huggingface.co/llmware/bling-answer-tool) - dragon-yi-answer-tool - 6b quantized RAG model - [dragon-yi-answer-tool](https://www.huggingface.co/llmware/dragon-yi-answer-tool) - dragon-mistral-answer-tool - 7b quantized RAG model - [dragon-mistral-answer-tool](https://www.huggingface.co/llmware/dragon-mistral-answer-tool) - dragon-llama-answer-tool - 7b quantized RAG 
model - [dragon-llama-answer-tool](https://www.huggingface.co/llmware/dragon-llama-answer-tool) **Set up** No special setup for SLIMs is required other than to install llmware >=0.2.6, e.g., `pip3 install llmware`. **Platforms:** - Mac M1, Mac x86, Windows, Linux (Ubuntu 22 preferred, supported on Ubuntu 20+) - RAM: 16 GB minimum - Python 3.9, 3.10, 3.11 (note: not supported on 3.12 yet) - llmware >= 0.2.6 version ### **Let's get started! 🚀** --- --- layout: default title: Datasets parent: Examples nav_order: 10 description: overview of the major modules and classes of LLMWare permalink: /examples/datasets --- # Datasets - Introduction by Examples llmware provides powerful capabilities to transform raw unstructured information into various model-ready datasets. ```python import os import json from llmware.library import Library from llmware.setup import Setup from llmware.dataset_tools import Datasets from llmware.retrieval import Query def build_and_use_dataset(library_name): # Setup a library and build a knowledge graph. Datasets will use the data in the knowledge graph print (f"\n > Creating library {library_name}...") library = Library().create_new_library(library_name) sample_files_path = Setup().load_sample_files() library.add_files(os.path.join(sample_files_path,"SmallLibrary")) library.generate_knowledge_graph() # Create a Datasets object from library datasets = Datasets(library) # Build a basic dataset useful for industry domain adaptation for fine-tuning embedding models print (f"\n > Building basic text dataset...") basic_embedding_dataset = datasets.build_text_ds(min_tokens=500, max_tokens=1000) dataset_location = os.path.join(library.dataset_path, basic_embedding_dataset["ds_id"]) print (f"\n > Dataset:") print (f"(Files referenced below are found in {dataset_location})") print (f"\n{json.dumps(basic_embedding_dataset, indent=2)}") sample = datasets.get_dataset_sample(datasets.current_ds_name) print (f"\nRandom sample from the dataset:\n{json.dumps(sample, indent=2)}") # Other Dataset Generation and Usage Examples: # Build a simple self-supervised generative dataset - extracts text and splits into 'text' & 'completion' # Several generative "prompt_wrappers" are available - chat_gpt | alpaca | basic_generative_completion_dataset = datasets.build_gen_ds_targeted_text_completion(prompt_wrapper="alpaca") # Build generative self-supervised training sets created by pairing 'header_text' with 'text' xsum_generative_completion_dataset = datasets.build_gen_ds_headline_text_xsum(prompt_wrapper="human_bot") topic_prompter_dataset = datasets.build_gen_ds_headline_topic_prompter(prompt_wrapper="chat_gpt") # Filter a library by a key term as part of building the dataset filtered_dataset = datasets.build_text_ds(query="agreement", filter_dict={"master_index":1}) # Pass a set of query results to create a dataset from those results only query_results = Query(library=library).query("africa") query_filtered_dataset = datasets.build_text_ds(min_tokens=250, max_tokens=600, qr=query_results) return 0 ``` For more examples, see the [datasets examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Datasets/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). 
## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  {% endfor %}
---
--- --- --- layout: default title: Embedding parent: Examples nav_order: 5 description: overview of the major modules and classes of LLMWare permalink: /examples/embedding --- # Embedding - Introduction by Examples We introduce ``llmware`` through self-contained examples. ```python """ This example is a fast start with Milvus Lite, which is a 'no-install' file-based version of Milvus, intended for rapid prototyping. A couple of key points to note: -- Platform - per Milvus docs, Milvus Lite is designed for Mac and Linux (not on Windows currently) -- PyMilvus - need to `pip install pymilvus>=2.4.2` -- within LLMWare: set MilvusConfig().set_config("lite", True) """ import os from llmware.library import Library from llmware.retrieval import Query from llmware.setup import Setup from llmware.status import Status from llmware.models import ModelCatalog from llmware.configs import LLMWareConfig, MilvusConfig from importlib import util if not util.find_spec("pymilvus"): print("\nto run this example with pymilvus, you need to install pymilvus: pip3 install pymilvus>=2.4.2") def setup_library(library_name): """ Note: this setup_library method is provided to enable a self-contained example to create a test library """ # Step 1 - Create library which is the main 'organizing construct' in llmware print ("\nupdate: Creating library: {}".format(library_name)) library = Library().create_new_library(library_name) # check the embedding status 'before' installing the embedding embedding_record = library.get_embedding_status() print("embedding record - before embedding ", embedding_record) # Step 2 - Pull down the sample files from S3 through the .load_sample_files() command # --note: if you need to refresh the sample files, set 'over_write=True' print ("update: Downloading Sample Files") sample_files_path = Setup().load_sample_files(over_write=False) # Step 3 - point ".add_files" method to the folder of documents that was just created # this method parses the documents, text chunks, and captures in database print("update: Parsing and Text Indexing Files") library.add_files(input_folder_path=os.path.join(sample_files_path, "Agreements"), chunk_size=400, max_chunk_size=600, smart_chunking=1) return library def install_vector_embeddings(library, embedding_model_name): """ This method is the core example of installing an embedding on a library. 
-- two inputs - (1) a pre-created library object and (2) the name of an embedding model """ library_name = library.library_name vector_db = LLMWareConfig().get_vector_db() print(f"\nupdate: Starting the Embedding: " f"library - {library_name} - " f"vector_db - {vector_db} - " f"model - {embedding_model_name}") # *** this is the one key line of code to create the embedding *** library.install_new_embedding(embedding_model_name=embedding_model_name, vector_db=vector_db, batch_size=100) # note: for using llmware as part of a larger application, you can check the real-time status by polling Status() # --both the EmbeddingHandler and Parsers write to Status() at intervals while processing update = Status().get_embedding_status(library_name, embedding_model_name) print("update: Embeddings Complete - Status() check at end of embedding - ", update) # Start using the new vector embeddings with Query sample_query = "incentive compensation" print("\n\nupdate: Run a sample semantic/vector query: {}".format(sample_query)) # queries are constructed by creating a Query object, and passing a library as input query_results = Query(library).semantic_query(sample_query, result_count=20) for i, entries in enumerate(query_results): # each query result is a dictionary with many useful keys text = entries["text"] document_source = entries["file_source"] page_num = entries["page_num"] vector_distance = entries["distance"] # to see all of the dictionary keys returned, uncomment the line below # print("update: query_results - all - ", i, entries) # for display purposes only, we will only show the first 125 characters of the text if len(text) > 125: text = text[0:125] + " ... " print("\nupdate: query results - {} - document - {} - page num - {} - distance - {} " .format( i, document_source, page_num, vector_distance)) print("update: text sample - ", text) # let's take a look at the library embedding status again at the end to confirm embeddings were created embedding_record = library.get_embedding_status() print("\nupdate: embedding record - ", embedding_record) return 0 if __name__ == "__main__": # Fast Start configuration - will use no-install embedded sqlite # -- if you have installed Mongo or Postgres, then change the .set_active_db accordingly LLMWareConfig().set_active_db("sqlite") # set the "lite" flag in MilvusConfig to True -> to use server version, set to False (which is default) MilvusConfig().set_config("lite", True) LLMWareConfig().set_vector_db("milvus") # Step 1 - create library library = setup_library("ex2_milvus_lite") # Step 2 - Select any embedding model in the LLMWare catalog # to see a list of the embedding models supported, uncomment the line below and print the list embedding_models = ModelCatalog().list_embedding_models() # for i, models in enumerate(embedding_models): # print("embedding models: ", i, models) # for this first embedding, we will use a very popular and fast sentence transformer embedding_model = "mini-lm-sbert" # note: if you want to swap out "mini-lm-sbert" for Open AI 'text-embedding-ada-002', uncomment these lines: # embedding_model = "text-embedding-ada-002" # os.environ["USER_MANAGED_OPENAI_API_KEY"] = "" # run the core script install_vector_embeddings(library, embedding_model) ```
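As a small variation on the example above, the same two functions can be pointed at a different no-install vector database simply by changing the configuration lines in the `__main__` block - a minimal sketch, assuming the rest of the script is unchanged:

```python
# swap the vector database without touching the embedding code itself
LLMWareConfig().set_active_db("sqlite")       # text collection database (unchanged)
LLMWareConfig().set_vector_db("chromadb")     # another no-install vector database option

library = setup_library("ex2_chromadb")
install_vector_embeddings(library, "mini-lm-sbert")
```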
For more examples, see the [embedding examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Embedding/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  {% endfor %}
---
--- --- --- layout: default title: Examples nav_order: 5 has_children: true description: examples, recipes and use cases permalink: /examples --- llmware offers a wide range of examples to cover the lifecycle of building RAG and Agent based applications using small language models: - [Parsing examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing) - ~14 stand-alone parsing examples for all common document types, including options for parsing in memory, outputting to JSON, parsing custom configured CSV and JSON files, running OCR on embedded images found in documents, table extraction, image extraction, text chunking, zip files, and web sources. - [Embedding examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Embedding) - ~15 stand-alone embedding examples to show how to use ~10 different vector databases and wide range of leading open source embedding models (including sentence transformers). - [Retrieval examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Retrieval) - ~10 stand-alone examples illustrating different query and retrieval techniques - semantic queries, text queries, document filters, page filters, 'hybrid' queries, author search, using query state, and generating bibliographies. - [Dataset examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Datasets) - ~5 stand-alone examples to show 'next steps' of how to leverage a Library to re-package content into various datasets and automated NLP analytics. - [Fast start example #1-Parsing](https://github.com/llmware-ai/llmware/blob/main/fast_start/rag/example-1-create_first_library.py) - shows the basics of parsing. - [Fast start example #2-Embedding](https://github.com/llmware-ai/llmware/blob/main/fast_start/rag/example-2-build_embeddings.py) - shows the basics of building embeddings. - [CustomTable examples](https://github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables) - ~5 examples to start building structured tables that can be used in conjunction with LLM-based workflows. - [Models examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Models) - ~20 examples showing a wide range of different model inferences and use cases, including the ability to integrate Ollama models, OpenChat (e.g., LMStudio) models, using LLama-3 and Phi-3, bringing your own models into the ModelCatalog, and configuring sampling settings. - [Prompts examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Prompts) - ~5 examples that illustrate how to use Prompt as an integrated workflow for integrating knowledge sources, managing prompt history, and applying fact-checking. - [SLIM-Agents examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents) - ~20 examples showing how to build multi-model, multi-step Agent processes using locally-running SLIM function calling models. - [Fast start example #3-Prompts and Models](https://github.com/llmware-ai/llmware/blob/main/fast_start/rag/example-3-prompts_and_models.py) - getting started with model inference. --- --- layout: default title: Introduction by Examples parent: Examples nav_order: 9 permalink: /examples/getting_started --- # Introduction by Examples We introduce ``llmware`` through self-contained examples. # Your first library and query {: .note } > The code here is a modified version from [example-1-create_first_library.py](https://github.com/llmware-ai/llmware/blob/main/fast_start/example-1-create_first_library.py). 
> The adjustments are made to ease understanding for this post.

In this introduction, we will walk through the steps of creating a **library**. To create a ``library`` in ``llmware``, we have to instantiate a ``library`` object and call the ``add_files`` method, which will parse the files, chunk up the text, and also index it. We will also download the sample files we provide, which can be used for any experimentation you might want to do.

**Configuring llmware**

Before we get started, we can influence the configuration of ``llmware``. For example, we can decide which **text collection** database to use, and set the logging level. By default, ``llmware`` uses MongoDB as the text collection database and has a ``debug_mode`` level of ``0``. This means that by default, ``llmware`` will show the status manager and print errors. The status manager is useful for large parsing jobs. In this ``library`` introduction, we will change the text collection database as well as the ``debug_mode``. As the text collection database, we will choose ``sqlite``. And we will change the ``debug_mode`` to ``2``, which will show the file name that is being parsed, i.e. file-by-file progress.

```python
from llmware.configs import LLMWareConfig

LLMWareConfig().set_active_db("sqlite")
LLMWareConfig().set_config("debug_mode", 2)
```

**Downloading sample files**

We start by downloading the sample files we need. ``llmware`` provides a set of sample files which we use throughout our examples. The following code snippet downloads these sample files, and in doing so creates the directories *Agreements*, *Invoices*, *UN-Resolutions-500*, *SmallLibrary*, *FinDocs*, and *AgreementsLarge*. If you want to get the newest version of the sample files, you can set ``over_write=True``. However, we encourage you to try it out with your own files once you are comfortable enough with ``llmware``.

```python
from llmware.setup import Setup

sample_files_path = Setup().load_sample_files(over_write=False)
```

``sample_files_path`` is the path where the files are stored. Assuming that your user name is ``foo``, on Linux the path would be ``'/home/foo/llmware_data/sample_files'``.

**Creating a library**

Now that we have data, we can start to create our library. In ``llmware``, a **library** is a collection of unstructured data. Currently, ``llmware`` supports *text* and *images*. The following code creates an empty ``library`` with the name ``my_llmware_library``.

```python
from llmware.library import Library

library = Library().create_new_library('my_llmware_library')
```

**Adding files to a library**

Now that we have created a ``library``, we are ready to *add files* to it. Currently, the ``add_files`` method supports pdf, pptx, docx, xlsx, csv, md, txt, json, wav, zip, jpg, and png files. The method will automatically choose the correct parser based on the file extension.

```python
library.add_files('/home/foo/llmware_data/sample_files/Agreements')
```

**The library card**

A ``library`` keeps an inventory of its files, similar to a good librarian. We do this with a *library card*. At the time of this writing, a library card has the keys _id, library_name, embedding, knowledge_graph, unique_doc_id, documents, blocks, images, pages, tables, and account_name.

```python
updated_library_card = library.get_library_card()

doc_count = updated_library_card["documents"]
block_count = updated_library_card["blocks"]

updated_library_card.keys()
```

You can also see where the library is stored via the ``library_main_path`` attribute.
Again, assuming your user name is *foo* and you are on a Linux system, the ``library_main_path`` is ``'/home/foo/llmware_data/accounts/llmware/my_llmware_library'``.

```python
library.library_main_path
```

**Querying a library**

Finally, we are ready to execute a query against our library. Remember that the text is indexed automatically when we add it to the library. The result of a ``Query`` is a list of dictionaries, where one dictionary is one result. A result dictionary has a wide range of useful keys. A few important keys in the dictionary are *text*, *file_source*, *page_num*, *doc_ID*, *block_ID*, and *matches*. In the following, we query the library for the base salary, return the first ten results, and iterate over the results.

```python
from llmware.retrieval import Query

query_results = Query(library).text_query('base salary', result_count=10)

for query_result in query_results:
    text = query_result["text"]
    file_source = query_result["file_source"]
    page_number = query_result["page_num"]
    doc_id = query_result["doc_ID"]
    block_id = query_result["block_ID"]
    matches = query_result["matches"]
```

You can take a look at all the keys that are returned by calling ``keys()``.

```python
query_results[0].keys()
```

--- --- layout: default title: Models parent: Examples nav_order: 3 description: overview of the major modules and classes of LLMWare permalink: /examples/models ---

# Models

We introduce ``llmware`` through self-contained examples.

```python """ This example demonstrates prompting local BLING models with provided context - easy to select among different BLING models between 1B - 4B, including both Pytorch versions and GGUF quantized versions, and to swap out the hello_world questions with your own test set. NOTE: if you are running on a CPU with limited memory (e.g., <16 GB of RAM), we would recommend sticking to the 1B parameter models, or using the quantized GGUF versions. You may get out-of-memory errors and/or very slow performance with ~3B parameter Pytorch models. Even with 16 GB+ of RAM, the 3B Pytorch models should run but will be slow (without GPU acceleration). """ import time from llmware.prompts import Prompt def hello_world_questions(): test_list = [ {"query": "What is the total amount of the invoice?", "answer": "$22,500.00", "context": "Services Vendor Inc. \n100 Elm Street Pleasantville, NY \nTO Alpha Inc. 5900 1st Street " "Los Angeles, CA \nDescription Front End Engineering Service $5000.00 \n Back End Engineering" " Service $7500.00 \n Quality Assurance Manager $10,000.00 \n Total Amount $22,500.00 \n" "Make all checks payable to Services Vendor Inc. Payment is due within 30 days." "If you have any questions concerning this invoice, contact Bia Hermes. " "THANK YOU FOR YOUR BUSINESS! INVOICE INVOICE # 0001 DATE 01/01/2022 FOR Alpha Project P.O. # 1000"}, {"query": "What was the amount of the trade surplus?", "answer": "62.4 billion yen ($416.6 million)", "context": "Japan’s September trade balance swings into surplus, surprising expectations" "Japan recorded a trade surplus of 62.4 billion yen ($416.6 million) for September, " "beating expectations from economists polled by Reuters for a trade deficit of 42.5 " "billion yen. Data from Japan’s customs agency revealed that exports in September " "increased 4.3% year on year, while imports slid 16.3% compared to the same period " "last year. According to FactSet, exports to Asia fell for the ninth straight month, " "which reflected ongoing China weakness. Exports were supported by shipments to " "Western markets, FactSet added. 
— Lim Hui Jie"}, {"query": "What was Microsoft's revenue in the 3rd quarter?", "answer": "$52.9 billion", "context": "Microsoft Cloud Strength Drives Third Quarter Results \nREDMOND, Wash. — April 25, 2023 — " "Microsoft Corp. today announced the following results for the quarter ended March 31, 2023," " as compared to the corresponding period of last fiscal year:\n· Revenue was $52.9 billion" " and increased 7% (up 10% in constant currency)\n· Operating income was $22.4 billion " "and increased 10% (up 15% in constant currency)\n· Net income was $18.3 billion and " "increased 9% (up 14% in constant currency)\n· Diluted earnings per share was $2.45 " "and increased 10% (up 14% in constant currency).\n"}, {"query": "When did the LISP machine market collapse?", "answer": "1987.", "context": "The attendees became the leaders of AI research in the 1960s." " They and their students produced programs that the press described as 'astonishing': " "computers were learning checkers strategies, solving word problems in algebra, " "proving logical theorems and speaking English. By the middle of the 1960s, research in " "the U.S. was heavily funded by the Department of Defense and laboratories had been " "established around the world. Herbert Simon predicted, 'machines will be capable, " "within twenty years, of doing any work a man can do'. Marvin Minsky agreed, writing, " "'within a generation ... the problem of creating 'artificial intelligence' will " "substantially be solved'. They had, however, underestimated the difficulty of the problem. " "Both the U.S. and British governments cut off exploratory research in response " "to the criticism of Sir James Lighthill and ongoing pressure from the US Congress " "to fund more productive projects. Minsky's and Papert's book Perceptrons was understood " "as proving that artificial neural networks approach would never be useful for solving " "real-world tasks, thus discrediting the approach altogether. The 'AI winter', a period " "when obtaining funding for AI projects was difficult, followed. In the early 1980s, " "AI research was revived by the commercial success of expert systems, a form of AI " "program that simulated the knowledge and analytical skills of human experts. By 1985, " "the market for AI had reached over a billion dollars. At the same time, Japan's fifth " "generation computer project inspired the U.S. and British governments to restore funding " "for academic research. However, beginning with the collapse of the Lisp Machine market " "in 1987, AI once again fell into disrepute, and a second, longer-lasting winter began."}, {"query": "When will employment start?", "answer": "April 16, 2012.", "context": "THIS EXECUTIVE EMPLOYMENT AGREEMENT (this “Agreement”) is entered " "into this 2nd day of April, 2012, by and between Aphrodite Apollo " "(“Executive”) and TestCo Software, Inc. (the “Company” or “Employer”), " "and shall become effective upon Executive’s commencement of employment " "(the “Effective Date”) which is expected to commence on April 16, 2012. " "The Company and Executive agree that unless Executive has commenced " "employment with the Company as of April 16, 2012 (or such later date as " "agreed by each of the Company and Executive) this Agreement shall be " "null and void and of no further effect."}, {"query": "What is the current rate on 10-year treasuries?", "answer": "4.58%", "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " "and a major increase in Treasury yields. 
The Dow Jones Industrial Average gained 195.12 points, " "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy " "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " "jobs. However, wages rose less than expected last month. Stocks posted a stunning " "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " "At its session low, the Dow had fallen as much as 198 points; it surged by more than " "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " "their lowest points in the day. Traders were unclear of the reason for the intraday " "reversal. Some noted it could be the softer wage number in the jobs report that made " "investors rethink their earlier bearish stance. Others noted the pullback in yields from " "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " "near its highest level in 14 years. The benchmark rate later eased from those levels, but " "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s " "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries " "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially " "some oversold conditions.'"}, {"query": "What is the governing law?", "answer": "State of Massachusetts", "context": "19. Governing Law and Procedures. This Agreement shall be governed by and interpreted " "under the laws of the State of Massachusetts, except with respect to Section 18(a) of this Agreement," " which shall be governed by the laws of the State of Delaware, without giving effect to any " "conflict of laws provisions. Employer and Executive each irrevocably and unconditionally " "(a) agrees that any action commenced by Employer for preliminary and permanent injunctive relief " "or other equitable relief related to this Agreement or any action commenced by Executive pursuant " "to any provision hereof, may be brought in the United States District Court for the federal " "district in which Executive’s principal place of employment is located, or if such court does " "not have jurisdiction or will not accept jurisdiction, in any court of general jurisdiction " "in the state and county in which Executive’s principal place of employment is located, " "(b) consents to the non-exclusive jurisdiction of any such court in any such suit, action o" "r proceeding, and (c) waives any objection which Employer or Executive may have to the " "laying of venue of any such suit, action or proceeding in any such court. Employer and " "Executive each also irrevocably and unconditionally consents to the service of any process, " "pleadings, notices or other papers in a manner permitted by the notice provisions of Section 8."}, {"query": "What is the amount of the base salary?", "answer": "$200,000.", "context": "2.2. Base Salary. 
For all the services rendered by Executive hereunder, during the " "Employment Period, Employer shall pay Executive a base salary at the annual rate of " "$200,000, payable semimonthly in accordance with Employer’s normal payroll practices. " "Executive’s base salary shall be reviewed annually by the Board (or the compensation committee " "of the Board), pursuant to Employer’s normal compensation and performance review policies " "for senior level executives, and may be increased but not decreased. The amount of any " "increase for each year shall be determined accordingly. For purposes of this Agreement, " "the term “Base Salary” shall mean the amount of Executive’s base salary established " "from time to time pursuant to this Section 2.2. "}, {"query": "Is the expected gross margin greater than 70%?", "answer": "Yes, between 71.5% and 72.%", "context": "Outlook NVIDIA’s outlook for the third quarter of fiscal 2024 is as follows:" "Revenue is expected to be $16.00 billion, plus or minus 2%. GAAP and non-GAAP " "gross margins are expected to be 71.5% and 72.5%, respectively, plus or minus " "50 basis points. GAAP and non-GAAP operating expenses are expected to be " "approximately $2.95 billion and $2.00 billion, respectively. GAAP and non-GAAP " "other income and expense are expected to be an income of approximately $100 " "million, excluding gains and losses from non-affiliated investments. GAAP and " "non-GAAP tax rates are expected to be 14.5%, plus or minus 1%, excluding any discrete items." "Highlights NVIDIA achieved progress since its previous earnings announcement " "in these areas: Data Center Second-quarter revenue was a record $10.32 billion, " "up 141% from the previous quarter and up 171% from a year ago. Announced that the " "NVIDIA® GH200 Grace™ Hopper™ Superchip for complex AI and HPC workloads is shipping " "this quarter, with a second-generation version with HBM3e memory expected to ship " "in Q2 of calendar 2024. "}, {"query": "What is Bank of America's rating on Target?", "answer": "Buy", "context": "Here are some of the tickers on my radar for Thursday, Oct. 12, taken directly from " "my reporter’s notebook: It’s the one-year anniversary of the S&P 500′s bear market bottom " "of 3,577. Since then, as of Wednesday’s close of 4,376, the broad market index " "soared more than 22%. Hotter than expected September consumer price index, consumer " "inflation. The Social Security Administration issues announced a 3.2% cost-of-living " "adjustment for 2024. Chipotle Mexican Grill (CMG) plans price increases. Pricing power. " "Cites consumer price index showing sticky retail inflation for the fourth time " "in two years. Bank of America upgrades Target (TGT) to buy from neutral. Cites " "risk/reward from depressed levels. Traffic could improve. Gross margin upside. " "Merchandising better. Freight and transportation better. Target to report quarter " "next month. In retail, the CNBC Investing Club portfolio owns TJX Companies (TJX), " "the off-price juggernaut behind T.J. Maxx, Marshalls and HomeGoods. Goldman Sachs " "tactical buy trades on Club names Wells Fargo (WFC), which reports quarter Friday, " "Humana (HUM) and Nvidia (NVDA). BofA initiates Snowflake (SNOW) with a buy rating." "If you like this story, sign up for Jim Cramer’s Top 10 Morning Thoughts on the " "Market email newsletter for free. Barclays cuts price targets on consumer products: " "UTZ Brands (UTZ) to $16 per share from $17. Kraft Heinz (KHC) to $36 per share from " "$38. Cyclical drag. J.M. 
Smucker (SJM) to $129 from $160. Secular headwinds. " "Coca-Cola (KO) to $59 from $70. Barclays cut PTs on housing-related stocks: Toll Brothers" "(TOL) to $74 per share from $82. Keeps underweight. Lowers Trex (TREX) and Azek" "(AZEK), too. Goldman Sachs (GS) announces sale of fintech platform and warns on " "third quarter of 19-cent per share drag on earnings. The buyer: investors led by " "private equity firm Sixth Street. Exiting a mistake. Rise in consumer engagement for " "Spotify (SPOT), says Morgan Stanley. The analysts hike price target to $190 per share " "from $185. Keeps overweight (buy) rating. JPMorgan loves elf Beauty (ELF). Keeps " "overweight (buy) rating but lowers price target to $139 per share from $150. " "Sees “still challenging” environment into third-quarter print. The Club owns shares " "in high-end beauty company Estee Lauder (EL). Barclays upgrades First Solar (FSLR) " "to overweight from equal weight (buy from hold) but lowers price target to $224 per " "share from $230. Risk reward upgrade. Best visibility of utility scale names."}, {"query": "Who is NVIDIA's partner for the driver assistance system?", "answer": "MediaTek", "context": "Automotive Second-quarter revenue was $253 million, down 15% from the previous " "quarter and up 15% from a year ago. Announced that NVIDIA DRIVE Orin™ is powering " "the new XPENG G6 Coupe SUV’s intelligent advanced driver assistance system. " "Partnered with MediaTek, which will develop mainstream automotive systems on " "chips for global OEMs, which integrate new NVIDIA GPU chiplet IP for AI and graphics."}, {"query": "What was the rate of decline in 3rd quarter sales?", "answer": "20% year-on-year.", "context": "Nokia said it would cut up to 14,000 jobs as part of a cost cutting plan following " "third quarter earnings that plunged. The Finnish telecommunications giant said that " "it will reduce its cost base and increase operation efficiency to “address the " "challenging market environment. The substantial layoffs come after Nokia reported " "third-quarter net sales declined 20% year-on-year to 4.98 billion euros. Profit over " "the period plunged by 69% year-on-year to 133 million euros."}, {"query": "What was professional visualization revenue in the quarter?", "answer": "$379 million", "context": "Gaming Second-quarter revenue was $2.49 billion, up 11% from the previous quarter and up " "22% from a year ago. Began shipping the GeForce RTX™ 4060 family of GPUs, " "bringing to gamers NVIDIA Ada Lovelace architecture and DLSS, starting at $299." "Announced NVIDIA Avatar Cloud Engine, or ACE, for Games, a custom AI model " "foundry service using AI-powered natural language interactions to transform games " "by bringing intelligence to non-playable characters. Added 35 DLSS games, including " "Diablo IV, Ratchet & Clank: Rift Apart, Baldur’s Gate 3 and F1 23, as well as Portal: " "Prelude RTX, a path-traced game made by the community using NVIDIA’s RTX Remix creator tool." "Professional Visualization Second-quarter revenue was $379 million, up 28% from the " "previous quarter and down 24% from a year ago. Announced three new desktop " "workstation RTX GPUs based on the Ada Lovelace architecture — NVIDIA RTX 5000, RTX 4500 " "and RTX 4000 — to deliver the latest AI, graphics and real-time rendering, which are " "shipping this quarter. 
Announced a major release of the NVIDIA Omniverse platform, " "with new foundation applications and services for developers and industrial " "enterprises to optimize and enhance their 3D pipelines with OpenUSD and " "generative AI. Joined with Pixar, Adobe, Apple and Autodesk to form the " "Alliance for OpenUSD to promote the standardization, development, evolution and " "growth of Universal Scene Description technology."}, {"query": "What is the executive's title?", "answer": "Senior Vice President, Event Planning ('SVP') of the Workforce Optimization Division.", "context": "2.1. Duties and Responsibilities and Extent of Service. During the Employment Period, " "Executive shall serve as Senior Vice President, Event Planning (“SVP”) of the Employer’s " "Workforce Optimization Division. In such role, Executive will report to the Board of " "Directors of Employer (the “Board”) and shall devote substantially all of his business time " "and attention and his best efforts and ability to the operations of Employer and its subsidiaries. " "Executive shall be responsible for running Employer’s day-to-day operations and shall perform " "faithfully, diligently and competently the duties and responsibilities of a SVP and such other " "duties and responsibilities as directed by the Board and are consistent with such position. " "The foregoing shall not be construed as preventing Executive from (a) making passive " "investments in other businesses or enterprises consistent with Employer’s code of conduct, " "or (b) engaging in any other business activity consistent with Employer’s code of conduct; " "provided that Executive seeks and obtains the prior approval of the Board before engaging " "in any other business activity. In addition, it shall not be a violation of this Agreement " "for Executive to participate in civic or charitable activities, deliver lectures, fulfill " "speaking engagements, teach at educational institutions, and/or manage personal investments " "(subject to the immediately preceding sentence); provided that such activities do not " "interfere in any substantial respect with the performance of Executive’s responsibilities " "as an employee in accordance with this Agreement. Executive may also serve on one or more " "corporate boards of another company (and committees thereof) upon giving advance notice " "to the Board prior to commencing service on any other corporate board."}, {"query": "According to the CFO, what led to the increase in cloud revenue?", "answer": "Focused execution by our sales teams and partners", "context": "'The world's most advanced AI models " "are coming together with the world's most universal user interface - natural language - " "to create a new era of computing,' said Satya Nadella, chairman and chief " "executive officer of Microsoft. 'Across the Microsoft Cloud, we are the platform " "of choice to help customers get the most value out of their digital spend and innovate " "for this next generation of AI.' 'Focused execution by our sales teams and partners " "in this dynamic environment resulted in Microsoft Cloud revenue of $28.5 billion, " "up 22% (up 25% in constant currency) year-over-year,' said Amy Hood, executive " "vice president and chief financial officer of Microsoft.\n"}, {"query": "Which company is located in Nevada?", "answer": "North Industries", "context": "To send notices to Blue Moon Tech, mail to their headquarters at: " "555 California Street, San Francisco, California 94123. 
To send notices to North Industries, mail to" "their principal U.S. offices at: 19832 32nd Avenue, Las Vegas, Nevada 23593.\nTo send notices " "to Red River Industries, send to: One Red River Road, Stamford, Connecticut 08234."}, {"query": "When can termination after a material breach occur?", "answer": "If the breach is not cured within 15 days of notice of the breach.", "context": "This Agreement shall remain in effect until terminated. Either party may terminate this " "agreement, any Statement of Work or Services Description for convenience by giving the other " "party 30 days written notice. Either party may terminate this Agreement or any work order or " "services description if the other party is in material breach or default of any obligation " "that is not cured within 15 days’ notice of such breach. The TestCo agrees to pay all fees " "for services performed and expenses incurred prior to the termination of this Agreement. " "Termination of this Agreement will terminate all outstanding Statement of Work or Services " "Description entered into under this agreement."}, {"query": "What is a headline summary in 10 words or less?", "answer": "Joe Biden is the 46th President of the United States.", "context": "Joe Biden's tenure as the 46th president of the United States began with " "his inauguration on January 20, 2021. Biden, a Democrat from Delaware who " "previously served as vice president under Barack Obama, " "took office following his victory in the 2020 presidential election over " "Republican incumbent president Donald Trump. Upon his inauguration, he " "became the oldest president in American history."}, {"query": "Who are the two people that won elections in Georgia?", "answer": "Jon Ossoff and Raphael Warnock", "context": "Though Biden was generally acknowledged as the winner, " "General Services Administration head Emily W. Murphy " "initially refused to begin the transition to the president-elect, " "thereby denying funds and office space to his team. " "On November 23, after Michigan certified its results, Murphy " "issued the letter of ascertainment, granting the Biden transition " "team access to federal funds and resources for an orderly transition. " "Two days after becoming the projected winner of the 2020 election, " "Biden announced the formation of a task force to advise him on the " "COVID-19 pandemic during the transition, co-chaired by former " "Surgeon General Vivek Murthy, former FDA commissioner David A. Kessler, " "and Yale University's Marcella Nunez-Smith. On January 5, 2021, " "the Democratic Party won control of the United States Senate, " "effective January 20, as a result of electoral victories in " "Georgia by Jon Ossoff in a runoff election for a six-year term " "and Raphael Warnock in a special runoff election for a two-year term. " "President-elect Biden had supported and campaigned for both " "candidates prior to the runoff elections on January 5.On January 6, " "a mob of thousands of Trump supporters violently stormed the Capitol " "in the hope of overturning Biden's election, forcing Congress to " "evacuate during the counting of the Electoral College votes. 
More " "than 26,000 National Guard members were deployed to the capital " "for the inauguration, with thousands remaining into the spring."}, {"query": "What is the list of the top financial highlights for the quarter?", "answer": "•Revenue: $52.9 million, up 10% in constant currency;\n" "•Operating income: $22.4 billion, up 15% in constant currency;\n" "•Net income: $18.3 billion, up 14% in constant currency;\n" "•Diluted earnings per share: $2.45 billion, up 14% in constant currency.", "context": "Microsoft Cloud Strength Drives Third Quarter Results \nREDMOND, Wash. — April 25, 2023 — " "Microsoft Corp. today announced the following results for the quarter ended March 31, 2023," " as compared to the corresponding period of last fiscal year:\n· Revenue was $52.9 billion" " and increased 7% (up 10% in constant currency)\n· Operating income was $22.4 billion " "and increased 10% (up 15% in constant currency)\n· Net income was $18.3 billion and " "increased 9% (up 14% in constant currency)\n· Diluted earnings per share was $2.45 " "and increased 10% (up 14% in constant currency).\n"}, {"query": "What is a list of the key points?", "answer": "•Stocks rallied on Friday with stronger-than-expected U.S jobs data and increase in " "Treasury yields;\n•Dow Jones gained 195.12 points;\n•S&P 500 added 1.59%;\n•Nasdaq Composite rose " "1.35%;\n•U.S. economy added 438,000 jobs in August, better than the 273,000 expected;\n" "•10-year Treasury rate trading near the highest level in 14 years at 4.58%.", "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, " "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy " "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " "jobs. However, wages rose less than expected last month. Stocks posted a stunning " "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " "At its session low, the Dow had fallen as much as 198 points; it surged by more than " "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " "their lowest points in the day. Traders were unclear of the reason for the intraday " "reversal. Some noted it could be the softer wage number in the jobs report that made " "investors rethink their earlier bearish stance. Others noted the pullback in yields from " "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " "near its highest level in 14 years. The benchmark rate later eased from those levels, but " "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s " "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries " "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially " "some oversold conditions.'"} ] return test_list def bling_meets_llmware_hello_world (model_name): """ Simple inference loop that loads a model and runs through a series of test questions. 
""" t0 = time.time() test_list = hello_world_questions() print(f"\n > Loading Model: {model_name}...") prompter = Prompt().load_model(model_name) t1 = time.time() print(f"\n > Model {model_name} load time: {t1-t0} seconds") for i, entries in enumerate(test_list): print(f"\n{i+1}. Query: {entries['query']}") # run the prompt output = prompter.prompt_main(entries["query"],context=entries["context"] , prompt_name="default_with_context",temperature=0.30) llm_response = output["llm_response"].strip("\n") print(f"LLM Response: {llm_response}") print(f"Gold Answer: {entries['answer']}") print(f"LLM Usage: {output['usage']}") t2 = time.time() print(f"\nTotal processing time: {t2-t1} seconds") return 0 if __name__ == "__main__": # list of 'rag-instruct' laptop-ready bling models on HuggingFace model_list = ["llmware/bling-1b-0.1", "llmware/bling-tiny-llama-v0", "llmware/bling-1.4b-0.1", "llmware/bling-falcon-1b-0.1", "llmware/bling-cerebras-1.3b-0.1", "llmware/bling-sheared-llama-1.3b-0.1", "llmware/bling-sheared-llama-2.7b-0.1", "llmware/bling-red-pajamas-3b-0.1", "llmware/bling-stable-lm-3b-4e1t-v0", "llmware/bling-phi-3", # use GGUF models too "bling-phi-3-gguf", # quantized bling-phi-3 "bling-answer-tool", # quantized bling-tiny-llama "bling-stablelm-3b-tool" # quantized bling-stablelm-3b ] # try the newest bling model - 'tiny-llama' bling_meets_llmware_hello_world(model_list[1]) ``` For more examples, see the [models examples]((https://www.github.com/llmware-ai/llmware/tree/main/examples/Models/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  {% endfor %}
---
--- --- --- layout: default title: Notebooks parent: Examples nav_order: 11 description: overview of the major modules and classes of LLMWare permalink: /examples/notebooks ---

# Notebooks - Introduction by Examples

We introduce ``llmware`` through self-contained examples.

# Understanding Google Colab and Jupyter Notebooks

Welcome to our project documentation! A common point of confusion among developers new to data science and machine learning workflows is the relationship and differences between Google Colab and Jupyter Notebooks. This README aims to clarify these points to ensure everyone is on the same page.

## What are Jupyter Notebooks?

Jupyter Notebooks is an open-source web application that lets you create and share documents that have live code, equations, visualizations, and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.

## What is Google Colab?

Google Colab (or Colaboratory) is a free Jupyter notebook environment that requires no setup and runs in the cloud. It offers a similar interface to Jupyter Notebooks and lets users write and execute Python in a web browser. Google Colab also provides free access to computing resources, including GPUs and TPUs, making it highly popular for machine learning and data analysis projects.

## Key Similarities

- **Interface:** Both platforms use the Jupyter Notebook interface, which supports mixing executable code, equations, visualizations, and narrative text in a single document.
- **Language Support:** Primarily, both are used for executing Python code. However, Jupyter Notebooks support other languages such as R and Julia.
- **Use Cases:** They are widely used for data analysis, machine learning, and education, allowing for easy sharing of results and methodologies.

## Increase Google Colab Computational Power with T4 GPU

Our models are designed to run on at least 16GB of RAM. By default, Google Colab provides ~13GB of RAM, which significantly slows computational speed. To ensure the best performance when using our models, we highly recommend enabling the T4 GPU in Colab. This will provide the notebook with additional resources, including 16GB of RAM, allowing our models to run smoothly and efficiently.

Steps to enable the T4 GPU in Colab:

1. In your Colab notebook, click on the "Runtime" tab
2. Select "Change runtime type"
3. Under "Hardware Accelerator", select T4 GPU

NOTE: There is a weekly usage limit on using the T4 for free. A short sketch for checking the runtime is included at the end of this page.

## Key Differences

- **Execution Environment:** Jupyter Notebooks can be run locally on your machine or on a server, but Google Colab is hosted in the cloud.
- **Access to Resources:** Google Colab provides free access to hardware accelerators (GPUs and TPUs), which is not inherently available in Jupyter Notebooks unless specifically set up by the user on their own servers.
- **Collaboration:** Google Colab offers easier collaboration features, similar to Google Docs, letting multiple users work on the same notebook simultaneously.

## Conclusion

While Google Colab and Jupyter Notebooks might seem different, they are built on the same idea and offer similar functionalities, with a few distinctions mainly in execution environment and access to computing resources. Understanding these platforms' capabilities can significantly enhance your data science and machine learning projects.

We hope this guide has helped clarify the similarities and differences between Google Colab and Jupyter Notebooks. Happy coding!
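## Quick runtime check (optional)

Before running the ``llmware`` examples in a Colab notebook, it can help to confirm that the runtime actually has the GPU and memory you expect. The short sketch below is illustrative; it assumes a standard Colab runtime, where ``torch`` and ``psutil`` are pre-installed (neither is part of ``llmware`` itself), and that ``llmware`` has been installed with ``pip install llmware``.

```python
# quick sanity check for a Colab runtime (illustrative sketch - not part of the llmware examples)
import psutil
import torch

# total RAM available to the runtime, in GB
ram_gb = psutil.virtual_memory().total / 1e9
print(f"runtime RAM: {ram_gb:.1f} GB")

# confirm that a GPU accelerator is attached
if torch.cuda.is_available():
    print("GPU enabled: ", torch.cuda.get_device_name(0))
else:
    print("no GPU detected - enable the T4 GPU under Runtime > Change runtime type")
```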
--- --- layout: default title: Parsing parent: Examples nav_order: 4 description: overview of the major modules and classes of LLMWare permalink: /examples/parsing ---

# Parsing - Introduction by Examples

We introduce ``llmware`` through self-contained examples.

🚀 Parsing Examples 🚀
===============

**Parsing is the Humble Hero of Good RAG Pipelines**

LLMWare supports parsing of a wide range of unstructured content types, and views parsing, text chunking and indexing as the first step in the pipeline. As with any pipeline, care and attention to getting "great input" is usually the key to "great output."

In this repository, we show several key features of parsing with llmware:

**Parsing PDFs like a Pro**

- Configuring text chunking and extraction parameters - [**PDF Configuration**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/pdf_parser_new_configs.py)
- PDF Table extraction - [**PDF Table**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/pdf_table_extraction.py)
- Fallback to OCR - [**PDF-by-OCR**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_pdf_by_ocr.py)

**Parsing Office Documents (Powerpoints, Word, Excel)**

- Configuring text chunking and extraction parameters - [**Office Configuration**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/office_parser_new_configs.py)
- Handling ZIPs and mixed file types - [**Microsoft IR Documents**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parsing_microsoft_ir_docs.py)
- Running OCR on Images Extracted - [**OCR Embedded Doc Images**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/ocr_embedded_doc_images.py)

**Parsing without a Database**

- Parse in Memory - [**Parse in Memory**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_in_memory.py)
- Parse directly into a Prompt - [**Parse in Prompt**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_into_prompt.py)
- Parse to JSON file - [**Parse to JSON**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_to_json.py)

**Other Content Types**

- Custom CSV - [**Custom CSV files**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_csv_custom.py)
- Custom JSON - [**Custom JSON files**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_jsonl_custom.py)
- Images - [**OCR on Images**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_images.py)
- Web/HTML - [**Website Extraction**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_web_sources_in_memory.py)
- Voice (WAV) - in Use_Cases - [**Parsing Great Speeches**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/parsing_great_speeches.py)

For more examples, see the [parsing examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well.

### **Let's get started! 🚀**

--- --- layout: default title: Prompts parent: Examples nav_order: 6 description: overview of the major modules and classes of LLMWare permalink: /examples/prompts ---

# Prompts - Introduction by Examples

We introduce ``llmware`` through self-contained examples.
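As a warm-up before the fuller scenarios below, here is a minimal sketch of the basic ``Prompt`` pattern used throughout these examples: load a model, pass a question together with a source passage as context, and read the response. The model name and context passage are illustrative; any model from the ``ModelCatalog`` can be substituted.

```python
from llmware.prompts import Prompt

# load a small, quantized, CPU-friendly model from the llmware catalog
prompter = Prompt().load_model("bling-answer-tool")

# illustrative source passage - in the examples below, context is assembled from parsed documents
context_passage = ("Services Vendor Inc. Total Amount $22,500.00. "
                   "Payment is due within 30 days.")

# ask a question against the provided context
response = prompter.prompt_main("What is the total amount of the invoice?",
                                context=context_passage,
                                prompt_name="default_with_context",
                                temperature=0.30)

print("llm response: ", response["llm_response"])
```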
# Basic RAG Scenario - Invoice Processing ```python """ This example shows an end-to-end scenario for invoice processing that can be run locally and without a database. The example shows how to combine the use of parsing combined with prompts_with_sources to rapidly iterate through a batch of invoices and ask a set of questions, and then save the full output to both (1) .jsonl for integration into an upstream application/database and (2) to a CSV for human review in excel. note: the sample code pulls from a public repo to load the sample invoice documents the first time - please feel free to substitute with your own invoice documents (PDF/DOCX/PPTX/XLSX/CSV/TXT) if you prefer. this example does not require a database or embedding this example can be run locally on a laptop by setting 'run_on_cpu=True' if 'run_on_cpu==False", then please see the example 'launch_llmware_inference_server.py' to configure and set up a 'pop-up' GPU inference server in just a few minutes """ import os import re from llmware.prompts import Prompt, HumanInTheLoop from llmware.configs import LLMWareConfig from llmware.setup import Setup from llmware.models import ModelCatalog def invoice_processing(run_on_cpu=True): # Step 1 - Pull down the sample files from S3 through the .load_sample_files() command # --note: if you need to refresh the sample files, set 'over_write=True' print("update: Downloading Sample Files") sample_files_path = Setup().load_sample_files(over_write=False) invoices_path = os.path.join(sample_files_path, "Invoices") # Step 2 - simple sample query list - each question will be asked to each invoice query_list = ["What is the total amount of the invoice?", "What is the invoice number?", "What are the names of the two parties?"] # Step 3 - Load Model if run_on_cpu: # load local bling model that can run on cpu/laptop # note: bling-1b-0.1 is the *fastest* & *smallest*, but will make more errors than larger BLING models # model_name = "llmware/bling-1b-0.1" # try the new bling-phi-3 quantized with gguf - most accurate model_name = 'bling-phi-3-gguf' else: # use GPU-based inference server to process # *** see the launch_llmware_inference_server.py example script to setup *** server_uri_string = "http://11.123.456.789:8088" # insert your server_uri_string server_secret_key = "demo-test" ModelCatalog().setup_custom_llmware_inference_server(server_uri_string, secret_key=server_secret_key) model_name = "llmware-inference-server" # attach inference server to prompt object prompter = Prompt().load_model(model_name) # Step 4 - main loop thru folder of invoices for i, invoice in enumerate(os.listdir(invoices_path)): # just in case (legacy on mac os file system - not needed on linux or windows) if invoice != ".DS_Store": print("\nAnalyzing invoice: ", str(i + 1), invoice) for question in query_list: # Step 4A - parses the invoices in memory and attaches as a source to the Prompt source = prompter.add_source_document(invoices_path,invoice) # Step 4B - executes the prompt on the LLM (with the loaded source) output = prompter.prompt_with_source(question,prompt_name="default_with_context") for i, response in enumerate(output): print("LLM Response - ", question, " - ", re.sub("[\n]"," ", response["llm_response"])) prompter.clear_source_materials() # Save jsonl report with full transaction history to /prompt_history folder print("\nupdate: prompt state saved at: ", os.path.join(LLMWareConfig.get_prompt_path(),prompter.prompt_id)) prompter.save_state() # Generate CSV report for easy Human review in Excel csv_output = 
HumanInTheLoop(prompter).export_current_interaction_to_csv() print("\nupdate: csv output for human review - ", csv_output) return 0 if __name__ == "__main__": invoice_processing(run_on_cpu=True) ``` # Document Summarizer ```python """ This Example shows a packaged 'document_summarizer' prompt using the slim-summary-tool. It shows a variety of techniques to summarize documents generally larger than a LLM context window, and how to assemble multiple source batches from the document, as well as using a 'query' and 'topic' to focus on specific segments of the document. """ import os from llmware.prompts import Prompt from llmware.setup import Setup def test_summarize_document(example="jd salinger"): # pull a sample document (or substitute a file_path and file_name of your own) sample_files_path = Setup().load_sample_files(over_write=False) topic = None query = None fp = None fn = None if example not in ["jd salinger", "employment terms", "just the comp", "un resolutions"]: print ("not found example") return [] if example == "jd salinger": fp = os.path.join(sample_files_path, "SmallLibrary") fn = "Jd-Salinger-Biography.docx" topic = "jd salinger" query = None if example == "employment terms": fp = os.path.join(sample_files_path, "Agreements") fn = "Athena EXECUTIVE EMPLOYMENT AGREEMENT.pdf" topic = "executive compensation terms" query = None if example == "just the comp": fp = os.path.join(sample_files_path, "Agreements") fn = "Athena EXECUTIVE EMPLOYMENT AGREEMENT.pdf" topic = "executive compensation terms" query = "base salary" if example == "un resolutions": fp = os.path.join(sample_files_path, "SmallLibrary") fn = "N2126108.pdf" # fn = "N2137825.pdf" topic = "key points" query = None # optional parameters: 'query' - will select among blocks with the query term # 'topic' - will pass a topic/issue as the parameter to the model to 'focus' the summary # 'max_batch_cap' - caps the number of batches sent to the model # 'text_only' - returns just the summary text aggregated kp = Prompt().summarize_document_fc(fp, fn, topic=topic, query=query, text_only=True, max_batch_cap=15) print(f"\nDocument summary completed - {len(kp)} Points") for i, points in enumerate(kp): print(i, points) return 0 if __name__ == "__main__": print(f"\nExample: Summarize Documents\n") # 4 examples - ["jd salinger", "employment terms", "just the comp", "un resolutions"] # -- "jd salinger" - summarizes key points about jd salinger from short biography document # -- "employment terms" - summarizes the executive compensation terms across 15 page document # -- "just the comp" - queries to find subset of document and then summarizes the key terms # -- "un resolutions" - summarizes the un resolutions document summary_direct = test_summarize_document(example="employment terms") ``` For more examples, see the [prompt examples]((https://www.github.com/llmware-ai/llmware/tree/main/examples/Prompts/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). 
You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!
    {% for contributor in site.github.contributors %}
  • {{ contributor.login }}
  {% endfor %}
---
--- --- --- layout: default title: Retrieval parent: Examples nav_order: 7 description: overview of the major modules and classes of LLMWare permalink: /examples/retrieval ---

# Retrieval - Introduction by Examples

We introduce ``llmware`` through self-contained examples.

# SEMANTIC Retrieval Example

```python
""" This 'getting started' example demonstrates how to use basic semantic retrieval with the Query class
    1. Create a sample library
    2. Run a basic semantic query
    3. View the results
"""

import os
from llmware.library import Library
from llmware.retrieval import Query
from llmware.setup import Setup
from llmware.configs import LLMWareConfig


def create_fin_docs_sample_library(library_name):

    print(f"update: creating library - {library_name}")

    library = Library().create_new_library(library_name)
    sample_files_path = Setup().load_sample_files(over_write=False)
    ingestion_folder_path = os.path.join(sample_files_path, "FinDocs")
    parsing_output = library.add_files(ingestion_folder_path)

    print(f"update: building embeddings - may take a few minutes the first time")

    # note: if you have installed Milvus or another vector DB, please feel free to substitute
    # note: if you have any memory constraints on laptop:
    #   (1) reduce embedding batch_size or ...
    #   (2) substitute "mini-lm-sbert" as embedding model
    library.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="chromadb", batch_size=200)

    return library


def basic_semantic_retrieval_example(library):

    # Create a Query instance
    q = Query(library)

    # Set the keys that should be returned - optional - full set of keys will be returned by default
    q.query_result_return_keys = ["distance", "file_source", "page_num", "text"]

    # perform a simple query
    my_query = "ESG initiatives"
    query_results1 = q.semantic_query(my_query, result_count=20)

    # Iterate through query_results, which is a list of result dicts
    print(f"\nQuery 1 - {my_query}")
    for i, result in enumerate(query_results1):
        print("results - ", i, result)

    # perform another query
    my_query2 = "stock performance"
    query_results2 = q.semantic_query(my_query2, result_count=10)

    print(f"\nQuery 2 - {my_query2}")
    for i, result in enumerate(query_results2):
        print("results - ", i, result)

    # perform another query
    my_query3 = "cloud computing"

    # note: use of embedding_distance_threshold will cap results with distance < 1.0
    query_results3 = q.semantic_query(my_query3, result_count=50, embedding_distance_threshold=1.0)

    print(f"\nQuery 3 - {my_query3}")
    for i, result in enumerate(query_results3):
        print("result - ", i, result)

    return [query_results1, query_results2, query_results3]


if __name__ == "__main__":

    print(f"Example - Running a Basic Semantic Query")

    LLMWareConfig().set_active_db("sqlite")

    # step 1- will create library + embeddings with Financial Docs
    lib = create_fin_docs_sample_library("lib_semantic_query_1")

    # step 2- run query against the library and embeddings
    my_results = basic_semantic_retrieval_example(lib)
```

For more examples, see the [retrieval examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Retrieval/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well.

# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)

# About the project

`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
## Contributing

Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).

## Code of conduct

We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.

## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)

``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.

## License

`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).

## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
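As a follow-on to the semantic example above, the same ``Query`` object also supports plain text queries and document-filtered queries. The sketch below is a hedged illustration only: it assumes the ``lib_semantic_query_1`` library created above already exists, and the document filter file name is a placeholder.

```python
""" Follow-on sketch - assumes the 'lib_semantic_query_1' library from the example above
    has already been created and embedded. """

from llmware.library import Library
from llmware.retrieval import Query
from llmware.configs import LLMWareConfig

LLMWareConfig().set_active_db("sqlite")

# load the previously created library
lib = Library().load_library("lib_semantic_query_1")
q = Query(lib)

# plain text query - runs against the text collection, no embeddings required
text_results = q.text_query("ESG initiatives", result_count=20, exact_mode=False)
for i, result in enumerate(text_results):
    print("text query results - ", i, result)

# restrict a query to a single document in the library (file name is illustrative)
filtered_results = q.text_query_with_document_filter("stock performance",
                                                     {"file_name": "selected file name"},
                                                     exact_mode=True)
for i, result in enumerate(filtered_results):
    print("filtered results - ", i, result)
```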
---
--- --- --- layout: default title: Structured Tables parent: Examples nav_order: 9 description: overview of the major modules and classes of LLMWare permalink: /examples/structured_tables --- # Structured Tables - Introduction by Examples We introduce ``llmware`` through self-contained examples. ```python """ This example shows the basic recipe for creating a CustomTable with LLMWare and a few of the basic methods to quickly get started. In this example, we will build a very simple 'hello world' Files table, which we will build upon in a future example by aggregating a more interesting and useful set of attributes from a LLMWare Library collection. CustomTable is designed to work with the text collection databases supported by LLMWare: SQL DBs --- Postgres and SQLIte NoSQL DB --- Mongo DB Even though Mongo does not require a schema for inserting and retrieving information, the CustomTable method will expect a defined schema to be provided (good best practice, in any case). """ from llmware.resources import CustomTable def hello_world_custom_table(): # simple schema for a table to track Files/Documents # note: the schema is a python dictionary, with named keys, and the value corresponding to the data type # for sqlite and postgres, any standard sql data type should generally work files_schema = {"custom_doc_num": "integer", "file_name": "text", "comments": "text"} # create a CustomTable object db_name = "sqlite" table_name = "files_table_1000" ct = CustomTable(db=db_name,table_name=table_name, schema=files_schema) # insert a few sample rows - each row is a dictionary with keys from the schema, and the *actual* values r1 = {"custom_doc_num": 1, "file_name": "technical_manual.pdf", "comments": "very useful overview"} ct.write_new_record(r1) r2 = {"custom_doc_num": 2, "file_name": "work_presentation.pptx", "comments": "need to save for future reference"} ct.write_new_record(r2) r3 = {"custom_doc_num": 3, "file_name": "dataset.json", "comments": "will use in next project"} ct.write_new_record(r3) # to see the entries - pull all items from the table all_results = ct.get_all() print("\nTEST #1 - Retrieving All Elements") for i, res in enumerate(all_results): print("results: ", i, res) # look at the database schema schema = ct.get_schema() print("\nTEST #2 - Getting the Table Schema") print("schema: ", schema) schema_str = ct.sql_table_create_string() print("table create sql: ", schema_str) # perform a basic lookup with 'key' and 'value' f = ct.lookup("custom_doc_num", 2) print("\nTEST #3 - Basic Lookup - 'custom_doc_num' = 2") print("lookup: ", f) # if you prefer SQL, pass a SQL query directly (note: this will only work on Postgres and SQLite) if db_name == "sqlite": # note: our standard 'unpacking' of a row of sqlite includes the rowid attribute custom_query = f"SELECT rowid, * FROM {table_name} WHERE custom_doc_num = 3;" elif db_name == "postgres": custom_query = f"SELECT * FROM {table_name} WHERE custom_doc_num = 3;" elif db_name == "mongo": custom_query = {"custom_doc_num": 3} else: print("must use either sqlite, postgres or mongo") return -1 cf = ct.custom_lookup(custom_query) print("\nTEST #4 - Custom SQL Lookup - 'custom_doc_num' = 3") print("custom query lookup: ", cf) print("\nTEST #5 - Making Updates and Deletes") # to delete a record ct.delete_record("custom_doc_num", 1) print("deleted record") # to update the values of a record ct.update_record({"custom_doc_num": 2}, "file_name", "work_presentation_update_v2.pptx") print("updated record") updated_all_results = ct.get_all() for i, res 
in enumerate(updated_all_results): print("updated results: ", i, res) print("\nTEST #6 - Delete Table - uncomment and set confirm=True") # done? delete the table and start over # -- note: confirm=True must be set # ct.delete_table(confirm=False) # look at all tables in the database tables = ct.list_all_tables() print("\nTEST #7 - View all of the tables on the DB") for i, t in enumerate(tables): print("tables:" ,i, t) return 0 if __name__ == "__main__": hello_world_custom_table() ``` These examples illustrate the use of the CustomTable class to quickly create SQL tables that can be used in conjunction with LLM-based workflows. 1. [**Intro to CustomTables**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/create_custom_table-1.py) - Getting started with using CustomTables 2. [**Loading CSV into CustomTables**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/loading_csv_into_custom_table-2a.py) - Loading CSV into CustomTables 3. [**Loading CSV into Library (Configured)**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/loading_csv_w_config_options-2b.py) - Loading CSV into Library 4. [**Loading JSON into CustomTables**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Stuctured_Tables/loading_json_custom_table-3a.py) - Loading JSON into CustomTable database 5 [**Loading JSON into Library (Configured)**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Stuctured_Tables/loading_json_w_config_options-3b.py) - Loading JSON into a library with configuration For more examples, see the [structured tables example]((https://www.github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
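The numbered list above links to a full CSV-loading example in the repo. As a rough orientation, the sketch below is a simplified, hedged version that reuses only the ``CustomTable`` methods already shown above (``write_new_record``, ``get_all``) together with Python's standard ``csv`` module - the file name and column names are illustrative assumptions.

```python
import csv
from llmware.resources import CustomTable

# same schema pattern as the 'hello world' example above
files_schema = {"custom_doc_num": "integer", "file_name": "text", "comments": "text"}

ct = CustomTable(db="sqlite", table_name="files_table_from_csv", schema=files_schema)

# 'files.csv' is an illustrative file with columns matching the schema keys
with open("files.csv", newline="") as f:
    reader = csv.DictReader(f)
    for row in reader:
        record = {"custom_doc_num": int(row["custom_doc_num"]),
                  "file_name": row["file_name"],
                  "comments": row["comments"]}
        ct.write_new_record(record)

# confirm the rows were inserted
for i, res in enumerate(ct.get_all()):
    print("csv-loaded row: ", i, res)
```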
---
--- --- --- layout: default title: UI parent: Examples nav_order: 8 description: overview of the major modules and classes of LLMWare permalink: /examples/ui --- # UI - Introduction by Examples We introduce ``llmware`` through self-contained examples. **UI Scenarios** We provide several 'UI' examples that show how to use LLMWare in a complex recipe combining different elements to accomplish a specific objective. While each example is still high-level, it is shared in the spirit of providing a framework 'starting point' that can be developed in more detail for a variety of common use cases. All of these examples use small, specialized models, running locally - 'Small, but Mighty'! 1. [**GGUF Streaming Chatbot**](https://www.github.com/llmware-ai/llmware/tree/main/examples/UI/gguf_streaming_chatbot.py) - Locally deployed chatbot using leading open source chat models, including Phi-3-GGUF - Uses Streamlit - Core simple framework of ~20 lines using llmware and Streamlit 2. [**Simple RAG UI with Streamlit**](https://www.github.com/llmware-ai/llmware/tree/main/examples/UI/simple_rag_ui_with_streamlit.py) - Simple RAG UI 3. [**RAG UI with Query Topic with Streamlit**](https://www.github.com/llmware-ai/llmware/tree/main/examples/UI/rag_ui_with_query_topic_with_streamlit.py) - UI demonstrating query-by-topic in a RAG scenario 4. [**Using Streamlit Chat UI**](https://www.github.com/llmware-ai/llmware/tree/main/examples/UI/using_streamlit_chat_ui.py) - Basic Streamlit Chat UI A minimal, non-streaming sketch of this pattern is included at the end of this page. For more examples, see the [UI examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/UI/) in the main repo. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you would like to make publicly, for example on GitHub by raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also send an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
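For orientation, here is a minimal, non-streaming sketch of the pattern the UI examples above build on: a Streamlit front end that passes a question and a pasted context passage to a small local model loaded through the ``Prompt`` class. This is a hedged starting point only - the linked examples use streaming and richer chat flows - and the model name and file name are assumptions.

```python
# save as simple_rag_sketch.py and run with:  streamlit run simple_rag_sketch.py
import streamlit as st
from llmware.prompts import Prompt

st.title("Minimal llmware RAG sketch")

@st.cache_resource
def load_prompter():
    # load a small local RAG-tuned model (model name is an assumption - any catalog model should work)
    return Prompt().load_model("bling-phi-3-gguf")

prompter = load_prompter()

question = st.text_input("Question:")
context = st.text_area("Paste a source passage to use as context:")

if question and context:
    output = prompter.prompt_main(question, context=context, prompt_name="default_with_context")
    st.write(output["llm_response"])
```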
---
--- --- --- layout: default title: Use Cases parent: Examples nav_order: 1 description: overview of the major modules and classes of LLMWare permalink: /examples/use_cases --- 🚀 Use Cases Examples 🚀 --- **End-to-End Scenarios** We provide several 'end-to-end' examples that show how to use LLMWare in a complex recipe combining different elements to accomplish a specific objective. While each example is still high-level, it is shared in the spirit of providing a high-level framework 'starting point' that can be developed in more detail for a variety of common use cases. All of these examples use small, specialized models, running locally - 'Small, but Mighty' ! 1. [**Research Automation with Agents and Web Services**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/web_services_slim_fx.py) - Prepare a 30-key research analysis on a company - Extract key lookup and other information from an earnings press release - Automatically use the lookup data for real-time stock information from YFinance - Automatically use the lookup date for background company history information in Wikipedia - Run LLM prompts to ask key questions of the Wikipedia sources - Aggregate into a consolidated research analysis - All with local open source models 2. [**Invoice Processing**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/invoice_processing.py) - Parse a batch of invoices (provided as sample files) - Extract key information from the invoices - Save the prompt state for follow-up review and analysis 3. [**Analyzing and Extracting Voice Transcripts**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/parsing_great_speeches.py) - Voice transcription of 50+ wav files of great speeches of the 20th century - Run text queries against the transcribed wav files - Execute LLM agent inferences to extract and identify key elements of interest - Prepare 'bibliography' with the key extracted points, including time-stamp 4. [**MSA Processing**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/msa_processing.py) - Identify the termination provisions in Master Service Agreements among a larger batch of contracts - Parse and query a large batch of contracts and identify the agreements with "Master Service Agreement" on the first page - Find the termination provisions in each MSA - Prompt LLM to read the termination provisions and answer a key question - Run a fact-check and source-check on the LLM response - Save all of the responses in CSV and JSON for follow-up review. 5. [**Querying a CSV**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/agent_with_custom_tables.py) - Start running natural language queries on CSVs with Postgres and slim-sql-tool. - Load a sample 'customer_table.csv' into Postgres - Start running natural language queries that get converted into SQL and query the DB 6. [**Contract Analysis**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/contract_analysis_on_laptop_with_bling_models.py) - Extract key information from set of employment agreement - Use a simple retrieval strategy with keyword search to identify key provisions and topic areas - Prompt LLM to read the key provisions and answer questions based on those source materials 7. 
[**Slicing and Dicing Office Docs**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/slicing_and_dicing_office_docs.py) - Shows a variety of advanced parsing techniques with Office document formats packaged in ZIP archives - Extracts tables and images, runs OCR against the embedded images, exports the whole library, and creates a dataset For more examples, see the [use cases examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/) in the main repo; a short sketch of the invoice-processing pattern also appears at the end of this page. Check back often - we are updating these examples regularly - and many of these examples have companion videos as well. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you would like to make publicly, for example on GitHub by raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also send an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
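The use cases above are full scripts in the repo. As a rough orientation, the invoice-processing pattern follows the same prompt-with-sources recipe used elsewhere in these docs; the hedged sketch below assumes an "Invoices" folder in the sample files and an illustrative query - see the linked example for the complete workflow.

```python
import os
from llmware.prompts import Prompt
from llmware.setup import Setup

# load the sample files and point at the invoices folder (folder name is an assumption)
sample_files_path = Setup().load_sample_files()
invoices_path = os.path.join(sample_files_path, "Invoices")

prompter = Prompt().load_model("bling-phi-3-gguf")

for invoice in os.listdir(invoices_path):

    # skip Mac file artifacts
    if invoice != ".DS_Store":

        # parse, text-chunk and filter the invoice, and package it into the prompt
        prompter.add_source_document(invoices_path, invoice, query="total amount")

        # ask the question against the packaged source
        responses = prompter.prompt_with_source("What is the total amount of the invoice?",
                                                prompt_name="default_with_context")

        for response in responses:
            print(invoice, " - ", response["llm_response"])

        prompter.clear_source_materials()

# save the prompt state for follow-up review
prompter.save_state()
```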
--- --- --- --- layout: default title: Clone Repo parent: Getting Started nav_order: 3 permalink: /getting_started/clone_repo --- ## ✍️ Working with the llmware Github repository The llmware repo can be pulled locally to get access to all the examples, or to work directly with the latest version of the llmware code. ```bash git clone git@github.com:llmware-ai/llmware.git ``` We have provided a **welcome_to_llmware** automation script in the root of the repository folder. After cloning: - On Windows command line: `.\welcome_to_llmware_windows.sh` - On Mac / Linux command line: `sh ./welcome_to_llmware.sh` Alternatively, if you prefer to complete setup without the welcome automation script, then the next steps include: 1. **install requirements.txt** - inside the /llmware path - e.g., ```pip3 install -r llmware/requirements.txt``` 2. **install requirements_extras.txt** - inside the /llmware path - e.g., ```pip3 install -r llmware/requirements_extras.txt``` (Depending upon your use case, you may not need all or any of these installs, but some of these will be used in the examples.) 3. **run examples** - copy one or more of the example .py files into the root project path. (We have seen several IDEs that will attempt to run interactively from the nested /example path, and then not have access to the /llmware module - the easy fix is to just copy the example you want to run into the root path). 4. **install vector db** - no-install vector db options include milvus lite, chromadb, faiss and lancedb - which do not require a server install, but do require that you install the python sdk library for that vector db, e.g., `pip3 install pymilvus`, or `pip3 install chromadb`. If you look in [examples/Embedding](https://github.com/llmware-ai/llmware/tree/main/examples/Embedding), you will see examples for getting started with various vector DB, and in the root of the repo, you will see easy-to-get-started docker compose scripts for installing milvus, postgres/pgvector, mongo, qdrant, neo4j, and redis. 5. Note: we have seen recently issues with Pytorch==2.3 on some platforms - if you run into any issues, we have seen that uninstalling Pytorch and downleveling to Pytorch==2.1 usually solves the problem. # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. 
[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
---
--- --- --- layout: default title: Fast Start parent: Getting Started nav_order: 4 permalink: /getting_started/fast_start --- Fast Start: Learning RAG with llmware through 6 examples --- **Welcome to llmware!** Fast Start is a structured series of 6 self-contained examples and accompanying videos that walk through the core foundational components of RAG with LLMWare. Set up `pip3 install llmware` or, if you prefer clone the github repo locally, e.g., `git clone git@github.com:llmware-ai/llmware.git `. Platforms: - Mac M1/M2/M3, Windows, Linux (Ubuntu 20 or Ubuntu 22 preferred) - RAM: 16 GB minimum - Python 3.9, 3.10, 3.11, 3.12 - Pull the latest version of llmware == 0.2.14 (as of mid-May 2024) - Please note that we have updated the examples from the original versions, to use new features in llmware, so there may be minor differences with the videos, which are annotated in the comments in each example. There are 6 examples, designed to be used step-by-step, but each is self-contained, so you can feel free to jump into any of the examples, in any order, that you prefer. Each example has been designed to be "copy-paste" and RUN with lots of helpful comments and explanations embedded in the code samples. Please check out our [Fast Start Youtube tutorials](https://www.youtube.com/playlist?list=PL1-dn33KwsmD7SB9iSO6vx4ZLRAWea1DB) that walk through each example below. Examples: **Section I - Learning the Main Components** 1. **Library** - parse, text chunk, and index to convert a "pile of files" into an AI-ready knowledge-base. [Video](https://youtu.be/2xDefZ4oBOM?si=8vRCvqj0-HG3zc4c) 2. **Embeddings** - apply an embedding model to the Library, store vectors, and start enabling natural language queries. [Video](https://youtu.be/xQEk6ohvfV0?si=B3X25ZsAZfW4AR_3) 3. **Prompts** & **Model Catalog** - start running inferences and building prompts. [Video](https://youtu.be/swiu4oBVfbA?si=0IVmLhiiYS3-pMIg) **Section II - Connecting Knowledge with Prompts - 3 scenarios** 4. **RAG with Text Query** - start integrating documents into prompts. [Video](https://youtu.be/6oALi67HP7U?si=pAbvio4ULXTIXKdL) 5. **RAG with Semantic Query** - use natural language queries on documents and integrate with prompts. [Video](https://youtu.be/XT4kIXA9H3Q?si=EBCAxVXBt5vgYY8s) 6. **RAG with more complex retrieval** - start integrating more complex retrieval patterns. [Video](https://youtu.be/G1Q6Ar8THbo?si=vIVAv35uXAcnaUJy) After completing these 6 examples, you should have a good foundation and set of recipes to start exploring the other 100+ examples in the /examples folder, and build more sophisticated LLM-based applications. **Models** - All of these examples are optimized for using local CPU-based models, primarily BLING and DRAGON. - If you want to substitute for any other model in the catalog, it is generally as easy as switching the model_name. If the model requires API keys, we show in the examples how to pass those keys as an environment variable. **Collection Databases** - Our parsers are optimized to index text chunks directly into a persistent data store. - For Fast Start, we will use "sqlite" which is an embedded database, requiring no install - For more scalable deployment, we would recommend either "mongo" or "postgres" - Install instructions for "mongo" and "postgres" are provided in docker-compose files in the repository **Vector Databases** - For Fast Start, we will use "chromadb" in persistent 'file' mode, requiring no install. 
- Note: if you are using Python < 3.12, then please feel free to substitute faiss (which was used in the videos). - Note: depending upon how and when you installed llmware, you may need to `pip install chromadb`. - For more scalable deployment, we would recommend installing one of 9 supported vector databases, including Milvus, PGVector (Postgres), Redis, Qdrant, Neo4j, Mongo-Atlas, Chroma, LanceDB, or Pinecone. - Install instructions are provided in "examples/Embedding" for each specific DB, as well as docker-compose scripts. - A short configuration sketch of the Fast Start defaults appears at the end of this page. **Local Private** - All of the processing will take place locally on your laptop. *This is an ongoing initiative to provide easy-to-get-started tutorials - we welcome and encourage feedback, as well as contributions with examples and other tips for helping others on their LLM application journeys!* **Let's get started!** # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you would like to make publicly, for example on GitHub by raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also send an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
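To make the Fast Start database choices above concrete, here is a minimal, hedged sketch of the configuration calls together with creating a first library; the library name, folder path and embedding model are illustrative assumptions.

```python
from llmware.configs import LLMWareConfig
from llmware.library import Library

# Fast Start defaults - embedded sqlite for the text collection, chromadb in file mode for vectors
LLMWareConfig().set_active_db("sqlite")
LLMWareConfig().set_vector_db("chromadb")

# create a library and point add_files at a folder of documents (path is illustrative)
lib = Library().create_new_library("fast_start_library")
lib.add_files("/path/to/my/files")

# install an embedding on the library to enable semantic queries
lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="chromadb", batch_size=100)
```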
---
--- --- --- layout: default title: Getting Started nav_order: 2 has_children: true description: getting started with llmware permalink: /getting_started --- ## Welcome to llmware
## 🧰🛠️🔩The Ultimate Toolkit for Enterprise RAG Pipelines with Small, Specialized Models From quickly building POCs to scalable LLM Apps for the enterprise, LLMWare is packed with all the tools you need. `llmware` is an integrated framework with over 50+ small, specialized, open source models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. This project provides a comprehensive set of tools that anyone can use - from a beginner to the most sophisticated AI developer - to rapidly build industrial-grade, knowledge-based enterprise LLM applications. Our specific focus is on making it easy to integrate open source small specialized models and connecting enterprise knowledge safely and securely. ## Getting Started 1. Install llmware - `pip3 install llmware` 2. Make sure that you are running on a [supported platform](https://github.com/llmware-ai/llmware/blob/main/docs/getting_started/platforms.md#platform-support). 3. Learn by example: -- [Fast Start examples](https://www.github.com/llmware-ai/llmware/tree/main/fast_start) - structured set of 6 examples (with no DB installations required) to learn the main concepts of RAG with LLMWare - each example has extensive comments, and a supporting video on Youtube to walk you through it. -- [Getting Started examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Getting_Started) - heavily-annotated examples that review many getting started elements - selecting a database, loading sample files, working with libraries, and how to use the Model Catalog. -- [Use Case examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases) - longer examples that integrate several components of LLMWare to provide a framework for a solution for common use case patterns. -- Dive into specific area of interest - [Parsing](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing) - [Models](https://www.github.com/llmware-ai/llmware/tree/main/examples/Models) - [Prompts](https://www.github.com/llmware-ai/llmware/tree/main/examples/Models) - [Agents](https://www.github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents) - and many more ... 4. We provide extensive [sample files](https://www.github.com/llmware-ai/tree/main/examples/Getting_Started/loading_sample_files.py) integrated into the examples, so you can copy-paste-run, and quickly validate that the installation is set up correctly, and to start seeing key classes and methods in action. We would encourage you to start with the 'out of the box' example first, and then use the example as the launching point for inserting your documents, models, queries, and workflows. 5. Learn by watching: check out the [LLMWare Youtube channel](https://www.youtube.com/@llmware). 6. Share with the community: join us on [Discord](https://discord.gg/MhZn5Nc39h). 
[Install llmware](#install-llmware){: .btn .btn-primary .fs-5 .mb-4 .mb-md-0 .mr-2 } [Common Setup & Configuration Items](#platform-support){: .btn .fs-5 .mb-4 .mb-md-0 } [Architecture](architecture.md/#llmware-architecture){: .btn .fs-5 .mb-4 .mb-md-0 } [View llmware on GitHub](https://www.github.com/llmware-ai/llmware/tree/main){: .btn .fs-5 .mb-4 .mb-md-0 } [Open an Issue on GitHub](https://www.github.com/llmware-ai/llmware/issues){: .btn .fs-5 .mb-4 .mb-md-0 } # Install llmware ___ **Using Pip Install** - Installing llmware is easy: `pip3 install llmware` - If you prefer, we also provide a set of recent wheels in the [wheel archives](https://www.github.com/llmware-ai/llmware/tree/main/wheel_archives) in this repository, which can be downloaded individually and used as follows: ```bash pip3 install llmware-0.2.12-py3-none-any.wheel ```` - We generally keep the main branch of this repository current with all changes, but we only publish new wheels to PyPi approximately once per week ___ ___ **Cloning the Repository** - If you prefer to clone the repository: ```bash git clone git@github.com:llmware-ai/llmware.git ``` - The llmware package is contained entirely in the /llmware folder path, so you should be able to drop this folder (with all of its contents) into a project tree, and use the llmware module essentially the same as a pip install. - Please ensure that you are capturing and updating the /llmware/lib folder, which includes required compiled shared libraries. If you prefer, you can keep only those libs required for your OS platform. - After cloning the repo, we provide a short 'welcome to llmware' automation script, which can be used to install the projects requirements (from llmware/requirements.txt), install several optional dependencies that are commonly used in examples, copy several good 'getting started' examples into the root folder, and then run a 'welcome_example.py' script to get started using our models. To use the "welcome to llmware" script: Windows: ```bash .\welcome_to_llmware_windows.sh ``` Mac/Linux: ```bash sh ./welcome_to_llmware.sh ``` # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. 
## License `llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
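After installing, a quick way to confirm that the setup is working is to load one small model from the ``ModelCatalog`` and run a single inference. This is a minimal, hedged smoke-test sketch with an illustrative model name and test passage.

```python
from llmware.models import ModelCatalog

# list everything available in the catalog
models = ModelCatalog().list_all_models()
print("number of models in catalog: ", len(models))

# load a small local model and run one test inference (model name is an assumption)
model = ModelCatalog().load_model("llmware/bling-tiny-llama-v0")
response = model.inference("What is the total amount of the invoice?",
                           add_context="Invoice #123 - Total Amount $22,500.00 - due in 30 days.")

# print the full response object returned by the model
print("response: ", response)
```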
---
--- --- --- layout: default title: Installation parent: Getting Started nav_order: 2 permalink: /getting_started/installation --- ## Installation Set up `pip3 install llmware` or, if you prefer clone the github repo locally, e.g., `git clone git@github.com:llmware-ai/llmware.git `. Platforms: - Mac M1/M2/M3, Windows, Linux (Ubuntu 20 or Ubuntu 22 preferred) - RAM: 16 GB minimum - Python 3.9, 3.10, 3.11 (note: not supported on 3.12 - coming soon!) - Pull the latest version of llmware == 0.2.11 (as of end of April 2024) - Please note that we have updated the examples from the original versions, to use new features in llmware, so there may be minor differences with the videos, which are annotated in the comments in each example. ## Wheel Archive - If you prefer, we also provide a set of recent wheels in the [wheel archives](https://www.github.com/llmware-ai/llmware/tree/main/wheel_archives) in this repository, which can be downloaded individually and used as follows: ```bash pip3 install llmware-0.2.12-py3-none-any.wheel ```` - We generally keep the main branch of this repository current with all changes, but we only publish new wheels to PyPi approximately once per week ___ ___ **Cloning the Repository** - If you prefer to clone the repository: ```bash git clone git@github.com:llmware-ai/llmware.git ``` - The llmware package is contained entirely in the /llmware folder path, so you should be able to drop this folder (with all of its contents) into a project tree, and use the llmware module essentially the same as a pip install. - Please ensure that you are capturing and updating the /llmware/lib folder, which includes required compiled shared libraries. If you prefer, you can keep only those libs required for your OS platform. - After cloning the repo, we provide a short 'welcome to llmware' automation script, which can be used to install the projects requirements (from llmware/requirements.txt), install several optional dependencies that are commonly used in examples, copy several good 'getting started' examples into the root folder, and then run a 'welcome_example.py' script to get started using our models. To use the "welcome to llmware" script: Windows: ```bash .\welcome_to_llmware_windows.sh ``` Mac/Linux: ```bash sh ./welcome_to_llmware.sh ``` # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. 
[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
---
--- --- --- layout: default title: Overview parent: Getting Started nav_order: 1 permalink: /getting_started/overview --- ## Welcome to llmware
## 🧰🛠️🔩 Building Enterprise RAG Pipelines with Small, Specialized Models `llmware` provides a unified framework for building LLM-based applications (e.g., RAG, Agents), using small, specialized models that can be deployed privately, integrated with enterprise knowledge sources safely and securely, and cost-effectively tuned and adapted for any business process. `llmware` has two main components: 1. **RAG Pipeline** - integrated components for the full lifecycle of connecting knowledge sources to generative AI models; and 2. **50+ small, specialized models** fine-tuned for key tasks in enterprise process automation, including fact-based question-answering, classification, summarization, and extraction. By bringing together both of these components, along with integrating leading open source models and underlying technologies, `llmware` offers a comprehensive set of tools to rapidly build knowledge-based enterprise LLM applications. Most of our examples can be run without a GPU server - get started right away on your laptop. ## 🎯 Key features Writing code with `llmware` is based on a few main concepts:
Model Catalog: Access all models the same way with easy lookup, regardless of underlying implementation. ```python # 150+ Models in Catalog with 50+ RAG-optimized BLING, DRAGON and Industry BERT models # Full support for GGUF, HuggingFace, Sentence Transformers and major API-based models # Easy to extend to add custom models - see examples from llmware.models import ModelCatalog from llmware.prompts import Prompt # all models accessed through the ModelCatalog models = ModelCatalog().list_all_models() # to use any model in the ModelCatalog - "load_model" method and pass the model_name parameter my_model = ModelCatalog().load_model("llmware/bling-phi-3-gguf") output = my_model.inference("what is the future of AI?", add_context="Here is the article to read") # to integrate model into a Prompt prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") response = prompter.prompt_main("what is the future of AI?", context="Insert Sources of information") ```
Library: ingest, organize and index a collection of knowledge at scale - Parse, Text Chunk and Embed. ```python from llmware.library import Library # to parse and text chunk a set of documents (pdf, pptx, docx, xlsx, txt, csv, md, json/jsonl, wav, png, jpg, html) # step 1 - create a library, which is the 'knowledge-base container' construct # - libraries have both text collection (DB) resources, and file resources (e.g., llmware_data/accounts/{library_name}) # - embeddings and queries are run against a library lib = Library().create_new_library("my_library") # step 2 - add_files is the universal ingestion function - point it at a local file folder with mixed file types # - files will be routed by file extension to the correct parser, parsed, text chunked and indexed in text collection DB lib.add_files("/folder/path/to/my/files") # to install an embedding on a library - pick an embedding model and vector_db lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="milvus", batch_size=500) # to add a second embedding to the same library (mix-and-match models + vector db) lib.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="chromadb", batch_size=100) # easy to create multiple libraries for different projects and groups finance_lib = Library().create_new_library("finance_q4_2023") finance_lib.add_files("/finance_folder/") hr_lib = Library().create_new_library("hr_policies") hr_lib.add_files("/hr_folder/") # pull library card with key metadata - documents, text chunks, images, tables, embedding record lib_card = Library().get_library_card("my_library") # see all libraries all_my_libs = Library().get_all_library_cards() ```
Query: query libraries with mix of text, semantic, hybrid, metadata, and custom filters. ```python from llmware.retrieval import Query from llmware.library import Library # step 1 - load the previously created library lib = Library().load_library("my_library") # step 2 - create a query object and pass the library q = Query(lib) # step 3 - run lots of different queries (many other options in the examples) # basic text query results1 = q.text_query("text query", result_count=20, exact_mode=False) # semantic query results2 = q.semantic_query("semantic query", result_count=10) # combining a text query restricted to only certain documents in the library and "exact" match to the query results3 = q.text_query_with_document_filter("new query", {"file_name": "selected file name"}, exact_mode=True) # to apply a specific embedding (if multiple on library), pass the names when creating the query object q2 = Query(lib, embedding_model_name="mini_lm_sbert", vector_db="milvus") results4 = q2.semantic_query("new semantic query") ```
Prompt with Sources: the easiest way to combine knowledge retrieval with a LLM inference. ```python from llmware.prompts import Prompt from llmware.retrieval import Query from llmware.library import Library # build a prompt prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") # add a file -> file is parsed, text chunked, filtered by query, and then packaged as model-ready context, # including in batches, if needed, to fit the model context window source = prompter.add_source_document("/folder/to/one/doc/", "filename", query="fast query") # attach query results (from a Query) into a Prompt my_lib = Library().load_library("my_library") results = Query(my_lib).query("my query") source2 = prompter.add_source_query_results(results) # run a new query against a library and load directly into a prompt source3 = prompter.add_source_new_query(my_lib, query="my new query", query_type="semantic", result_count=15) # to run inference with 'prompt with sources' responses = prompter.prompt_with_source("my query") # to run fact-checks - post inference fact_check = prompter.evidence_check_sources(responses) # to view source materials (batched 'model-ready' and attached to prompt) source_materials = prompter.review_sources_summary() # to see the full prompt history prompt_history = prompter.get_current_history() ```
RAG-Optimized Models - 1-7B parameter models designed for RAG workflow integration and running locally. ``` """ This 'Hello World' example demonstrates how to get started using local BLING models with provided context, using both Pytorch and GGUF versions. """ import time from llmware.prompts import Prompt def hello_world_questions(): test_list = [ {"query": "What is the total amount of the invoice?", "answer": "$22,500.00", "context": "Services Vendor Inc. \n100 Elm Street Pleasantville, NY \nTO Alpha Inc. 5900 1st Street " "Los Angeles, CA \nDescription Front End Engineering Service $5000.00 \n Back End Engineering" " Service $7500.00 \n Quality Assurance Manager $10,000.00 \n Total Amount $22,500.00 \n" "Make all checks payable to Services Vendor Inc. Payment is due within 30 days." "If you have any questions concerning this invoice, contact Bia Hermes. " "THANK YOU FOR YOUR BUSINESS! INVOICE INVOICE # 0001 DATE 01/01/2022 FOR Alpha Project P.O. # 1000"}, {"query": "What was the amount of the trade surplus?", "answer": "62.4 billion yen ($416.6 million)", "context": "Japan’s September trade balance swings into surplus, surprising expectations" "Japan recorded a trade surplus of 62.4 billion yen ($416.6 million) for September, " "beating expectations from economists polled by Reuters for a trade deficit of 42.5 " "billion yen. Data from Japan’s customs agency revealed that exports in September " "increased 4.3% year on year, while imports slid 16.3% compared to the same period " "last year. According to FactSet, exports to Asia fell for the ninth straight month, " "which reflected ongoing China weakness. Exports were supported by shipments to " "Western markets, FactSet added. — Lim Hui Jie"}, {"query": "When did the LISP machine market collapse?", "answer": "1987.", "context": "The attendees became the leaders of AI research in the 1960s." " They and their students produced programs that the press described as 'astonishing': " "computers were learning checkers strategies, solving word problems in algebra, " "proving logical theorems and speaking English. By the middle of the 1960s, research in " "the U.S. was heavily funded by the Department of Defense and laboratories had been " "established around the world. Herbert Simon predicted, 'machines will be capable, " "within twenty years, of doing any work a man can do'. Marvin Minsky agreed, writing, " "'within a generation ... the problem of creating 'artificial intelligence' will " "substantially be solved'. They had, however, underestimated the difficulty of the problem. " "Both the U.S. and British governments cut off exploratory research in response " "to the criticism of Sir James Lighthill and ongoing pressure from the US Congress " "to fund more productive projects. Minsky's and Papert's book Perceptrons was understood " "as proving that artificial neural networks approach would never be useful for solving " "real-world tasks, thus discrediting the approach altogether. The 'AI winter', a period " "when obtaining funding for AI projects was difficult, followed. In the early 1980s, " "AI research was revived by the commercial success of expert systems, a form of AI " "program that simulated the knowledge and analytical skills of human experts. By 1985, " "the market for AI had reached over a billion dollars. At the same time, Japan's fifth " "generation computer project inspired the U.S. and British governments to restore funding " "for academic research. 
However, beginning with the collapse of the Lisp Machine market " "in 1987, AI once again fell into disrepute, and a second, longer-lasting winter began."}, {"query": "What is the current rate on 10-year treasuries?", "answer": "4.58%", "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, " "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy " "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " "jobs. However, wages rose less than expected last month. Stocks posted a stunning " "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " "At its session low, the Dow had fallen as much as 198 points; it surged by more than " "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " "their lowest points in the day. Traders were unclear of the reason for the intraday " "reversal. Some noted it could be the softer wage number in the jobs report that made " "investors rethink their earlier bearish stance. Others noted the pullback in yields from " "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " "near its highest level in 14 years. The benchmark rate later eased from those levels, but " "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s " "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries " "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially " "some oversold conditions.'"}, {"query": "Is the expected gross margin greater than 70%?", "answer": "Yes, between 71.5% and 72.%", "context": "Outlook NVIDIA’s outlook for the third quarter of fiscal 2024 is as follows:" "Revenue is expected to be $16.00 billion, plus or minus 2%. GAAP and non-GAAP " "gross margins are expected to be 71.5% and 72.5%, respectively, plus or minus " "50 basis points. GAAP and non-GAAP operating expenses are expected to be " "approximately $2.95 billion and $2.00 billion, respectively. GAAP and non-GAAP " "other income and expense are expected to be an income of approximately $100 " "million, excluding gains and losses from non-affiliated investments. GAAP and " "non-GAAP tax rates are expected to be 14.5%, plus or minus 1%, excluding any discrete items." "Highlights NVIDIA achieved progress since its previous earnings announcement " "in these areas: Data Center Second-quarter revenue was a record $10.32 billion, " "up 141% from the previous quarter and up 171% from a year ago. Announced that the " "NVIDIA® GH200 Grace™ Hopper™ Superchip for complex AI and HPC workloads is shipping " "this quarter, with a second-generation version with HBM3e memory expected to ship " "in Q2 of calendar 2024. "}, {"query": "What is Bank of America's rating on Target?", "answer": "Buy", "context": "Here are some of the tickers on my radar for Thursday, Oct. 
12, taken directly from " "my reporter’s notebook: It’s the one-year anniversary of the S&P 500′s bear market bottom " "of 3,577. Since then, as of Wednesday’s close of 4,376, the broad market index " "soared more than 22%. Hotter than expected September consumer price index, consumer " "inflation. The Social Security Administration issues announced a 3.2% cost-of-living " "adjustment for 2024. Chipotle Mexican Grill (CMG) plans price increases. Pricing power. " "Cites consumer price index showing sticky retail inflation for the fourth time " "in two years. Bank of America upgrades Target (TGT) to buy from neutral. Cites " "risk/reward from depressed levels. Traffic could improve. Gross margin upside. " "Merchandising better. Freight and transportation better. Target to report quarter " "next month. In retail, the CNBC Investing Club portfolio owns TJX Companies (TJX), " "the off-price juggernaut behind T.J. Maxx, Marshalls and HomeGoods. Goldman Sachs " "tactical buy trades on Club names Wells Fargo (WFC), which reports quarter Friday, " "Humana (HUM) and Nvidia (NVDA). BofA initiates Snowflake (SNOW) with a buy rating." "If you like this story, sign up for Jim Cramer’s Top 10 Morning Thoughts on the " "Market email newsletter for free. Barclays cuts price targets on consumer products: " "UTZ Brands (UTZ) to $16 per share from $17. Kraft Heinz (KHC) to $36 per share from " "$38. Cyclical drag. J.M. Smucker (SJM) to $129 from $160. Secular headwinds. " "Coca-Cola (KO) to $59 from $70. Barclays cut PTs on housing-related stocks: Toll Brothers" "(TOL) to $74 per share from $82. Keeps underweight. Lowers Trex (TREX) and Azek" "(AZEK), too. Goldman Sachs (GS) announces sale of fintech platform and warns on " "third quarter of 19-cent per share drag on earnings. The buyer: investors led by " "private equity firm Sixth Street. Exiting a mistake. Rise in consumer engagement for " "Spotify (SPOT), says Morgan Stanley. The analysts hike price target to $190 per share " "from $185. Keeps overweight (buy) rating. JPMorgan loves elf Beauty (ELF). Keeps " "overweight (buy) rating but lowers price target to $139 per share from $150. " "Sees “still challenging” environment into third-quarter print. The Club owns shares " "in high-end beauty company Estee Lauder (EL). Barclays upgrades First Solar (FSLR) " "to overweight from equal weight (buy from hold) but lowers price target to $224 per " "share from $230. Risk reward upgrade. Best visibility of utility scale names."}, {"query": "What was the rate of decline in 3rd quarter sales?", "answer": "20% year-on-year.", "context": "Nokia said it would cut up to 14,000 jobs as part of a cost cutting plan following " "third quarter earnings that plunged. The Finnish telecommunications giant said that " "it will reduce its cost base and increase operation efficiency to “address the " "challenging market environment. The substantial layoffs come after Nokia reported " "third-quarter net sales declined 20% year-on-year to 4.98 billion euros. Profit over " "the period plunged by 69% year-on-year to 133 million euros."}, {"query": "What is a list of the key points?", "answer": "•Stocks rallied on Friday with stronger-than-expected U.S jobs data and increase in " "Treasury yields;\n•Dow Jones gained 195.12 points;\n•S&P 500 added 1.59%;\n•Nasdaq Composite rose " "1.35%;\n•U.S. 
economy added 438,000 jobs in August, better than the 273,000 expected;\n" "•10-year Treasury rate trading near the highest level in 14 years at 4.58%.", "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, " "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy " "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " "jobs. However, wages rose less than expected last month. Stocks posted a stunning " "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " "At its session low, the Dow had fallen as much as 198 points; it surged by more than " "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " "their lowest points in the day. Traders were unclear of the reason for the intraday " "reversal. Some noted it could be the softer wage number in the jobs report that made " "investors rethink their earlier bearish stance. Others noted the pullback in yields from " "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " "near its highest level in 14 years. The benchmark rate later eased from those levels, but " "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s " "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries " "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially " "some oversold conditions.'"} ] return test_list # this is the main script to be run def bling_meets_llmware_hello_world (model_name): t0 = time.time() # load the questions test_list = hello_world_questions() print(f"\n > Loading Model: {model_name}...") # load the model prompter = Prompt().load_model(model_name) t1 = time.time() print(f"\n > Model {model_name} load time: {t1-t0} seconds") for i, entries in enumerate(test_list): print(f"\n{i+1}. 
Query: {entries['query']}") # run the prompt output = prompter.prompt_main(entries["query"], context=entries["context"], prompt_name="default_with_context", temperature=0.30) # print out the results llm_response = output["llm_response"].strip("\n") print(f"LLM Response: {llm_response}") print(f"Gold Answer: {entries['answer']}") print(f"LLM Usage: {output['usage']}") t2 = time.time() print(f"\nTotal processing time: {t2-t1} seconds") return 0 if __name__ == "__main__": # list of 'rag-instruct' laptop-ready small bling models on HuggingFace pytorch_models = ["llmware/bling-1b-0.1", # most popular "llmware/bling-tiny-llama-v0", # fastest "llmware/bling-1.4b-0.1", "llmware/bling-falcon-1b-0.1", "llmware/bling-cerebras-1.3b-0.1", "llmware/bling-sheared-llama-1.3b-0.1", "llmware/bling-sheared-llama-2.7b-0.1", "llmware/bling-red-pajamas-3b-0.1", "llmware/bling-stable-lm-3b-4e1t-v0", "llmware/bling-phi-3" # most accurate (and newest) ] # Quantized GGUF versions generally load faster and run nicely on a laptop with at least 16 GB of RAM gguf_models = ["bling-phi-3-gguf", "bling-stablelm-3b-tool", "dragon-llama-answer-tool", "dragon-yi-answer-tool", "dragon-mistral-answer-tool"] # try a model from either the pytorch or gguf model list # the newest (and most accurate) is 'bling-phi-3-gguf' bling_meets_llmware_hello_world(gguf_models[0]) # check out the model card on Huggingface for RAG benchmark test performance results and other useful information ```
Simple-to-Scale Database Options - integrated data stores from laptop to parallelized cluster. ```python from llmware.configs import LLMWareConfig # to set the collection database - mongo, sqlite, postgres LLMWareConfig().set_active_db("mongo") # to set the vector database (or declare when installing) # --options: milvus, pg_vector (postgres), redis, qdrant, faiss, pinecone, mongo atlas LLMWareConfig().set_vector_db("milvus") # for fast start - no installations required LLMWareConfig().set_active_db("sqlite") LLMWareConfig().set_vector_db("chromadb") # try also faiss and lancedb # for single postgres deployment LLMWareConfig().set_active_db("postgres") LLMWareConfig().set_vector_db("postgres") # to install mongo, milvus, postgres - see the docker-compose scripts as well as examples ```
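To see how these settings flow into an actual pipeline, here is a minimal sketch - the library name and embedding model are illustrative, and it assumes the fast-start sqlite + chromadb combination above along with the llmware sample files:
```python
import os

from llmware.configs import LLMWareConfig
from llmware.library import Library
from llmware.retrieval import Query
from llmware.setup import Setup

# fast start - no separate database installs assumed (sqlite + chromadb)
LLMWareConfig().set_active_db("sqlite")
LLMWareConfig().set_vector_db("chromadb")

# parse and text-chunk sample documents into a new library (text collection)
sample_files_path = Setup().load_sample_files()
lib = Library().create_new_library("fast_start_library")   # illustrative library name
lib.add_files(input_folder_path=os.path.join(sample_files_path, "Agreements"))

# build embeddings in the configured vector database
lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="chromadb")

# run a quick semantic query to confirm the text and vector stores are wired together
results = Query(lib).semantic_query("base salary", result_count=3)
for res in results:
    # result keys shown here ('file_source', 'text') follow the usual llmware query output
    print(res["file_source"], "-", res["text"][:75])
```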
🔥 Agents with Function Calls and SLIM Models 🔥 ```python from llmware.agents import LLMfx text = ("Tesla stock fell 8% in premarket trading after reporting fourth-quarter revenue and profit that " "missed analysts’ estimates. The electric vehicle company also warned that vehicle volume growth in " "2024 'may be notably lower' than last year’s growth rate. Automotive revenue, meanwhile, increased " "just 1% from a year earlier, partly because the EVs were selling for less than they had in the past. " "Tesla implemented steep price cuts in the second half of the year around the world. In a Wednesday " "presentation, the company warned investors that it’s 'currently between two major growth waves.'") # create an agent using LLMfx class agent = LLMfx() # load text to process agent.load_work(text) # load 'models' as 'tools' to be used in analysis process agent.load_tool("sentiment") agent.load_tool("extract") agent.load_tool("topics") agent.load_tool("boolean") # run function calls using different tools agent.sentiment() agent.topics() agent.extract(params=["company"]) agent.extract(params=["automotive revenue growth"]) agent.xsum() agent.boolean(params=["is 2024 growth expected to be strong? (explain)"]) # at end of processing, show the report that was automatically aggregated by key report = agent.show_report() # displays a summary of the activity in the process activity_summary = agent.activity_summary() # list of the responses gathered for i, entries in enumerate(agent.response_list): print("update: response analysis: ", i, entries) output = {"report": report, "activity_summary": activity_summary, "journal": agent.journal} ```
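Outside of the agent loop, an individual SLIM tool can also be called directly from the model catalog - a minimal sketch, assuming the `slim-sentiment-tool` model is available in the ModelCatalog:
```python
from llmware.models import ModelCatalog

# load a SLIM classifier tool from the model catalog (downloaded and cached on first use)
model = ModelCatalog().load_model("slim-sentiment-tool")

# run a structured function call over a short text sample
response = model.function_call("Tesla stock fell 8% in premarket trading after weak guidance.")

# the llm_response is a python dictionary, e.g. {'sentiment': ['negative']}
print("sentiment function call output: ", response["llm_response"])
```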
🚀 Start coding - Quick Start for RAG 🚀 ```python # This example illustrates a simple contract analysis # using a RAG-optimized LLM running locally import os import re from llmware.prompts import Prompt, HumanInTheLoop from llmware.setup import Setup from llmware.configs import LLMWareConfig def contract_analysis_on_laptop (model_name): # In this scenario, we will: # -- download a set of sample contract files # -- create a Prompt and load a BLING LLM model # -- parse each contract, extract the relevant passages, and pass questions to a local LLM # Main loop - Iterate thru each contract: # # 1. parse the document in memory (convert from PDF file into text chunks with metadata) # 2. filter the parsed text chunks with a "topic" (e.g., "governing law") to extract relevant passages # 3. package and assemble the text chunks into a model-ready context # 4. ask three key questions for each contract to the LLM # 5. print to the screen # 6. save the results in both json and csv for further processing and review. # Load the llmware sample files print (f"\n > Loading the llmware sample files...") sample_files_path = Setup().load_sample_files() contracts_path = os.path.join(sample_files_path,"Agreements") # Query list - these are the 3 main topics and questions that we would like the LLM to analyze for each contract query_list = {"executive employment agreement": "What are the names of the two parties?", "base salary": "What is the executive's base salary?", "vacation": "How many vacation days will the executive receive?"} # Load the selected model by name that was passed into the function print (f"\n > Loading model {model_name}...") prompter = Prompt().load_model(model_name, temperature=0.0, sample=False) # Main loop for i, contract in enumerate(os.listdir(contracts_path)): # excluding Mac file artifact (annoying, but fact of life in demos) if contract != ".DS_Store": print("\nAnalyzing contract: ", str(i+1), contract) print("LLM Responses:") for key, value in query_list.items(): # step 1 + 2 + 3 above - contract is parsed, text-chunked, filtered by topic key, # ... and then packaged into the prompt source = prompter.add_source_document(contracts_path, contract, query=key) # step 4 above - calling the LLM with 'source' information already packaged into the prompt responses = prompter.prompt_with_source(value, prompt_name="default_with_context") # step 5 above - print out to screen for r, response in enumerate(responses): print(key, ":", re.sub("[\n]"," ", response["llm_response"]).strip()) # We're done with this contract, clear the source from the prompt prompter.clear_source_materials() # step 6 above - saving the analysis to jsonl and csv # Save jsonl report to /prompt_history folder print("\nPrompt state saved at: ", os.path.join(LLMWareConfig.get_prompt_path(),prompter.prompt_id)) prompter.save_state() # Save csv report that includes the model, response, prompt, and evidence for human-in-the-loop review csv_output = HumanInTheLoop(prompter).export_current_interaction_to_csv() print("csv output saved at: ", csv_output) if __name__ == "__main__": # use local cpu model - try the newest - RAG finetune of Phi-3 quantized and packaged in GGUF model = "bling-phi-3-gguf" contract_analysis_on_laptop(model) ```
# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
---
--- --- --- layout: default title: Platforms Supported parent: Getting Started nav_order: 5 permalink: /getting_started/platforms --- ___ # Platform Support ___ **Platform Supported** - **Python 3.9+** (note that we just added support for 3.12 starting in llmware version 0.2.12) - **System RAM**: recommended 16 GB RAM minimum (to run most local models on CPU) - **OS Supported**: Mac OS M1/M2/M3, Windows, Linux Ubuntu 20/22. We regularly build and test on Windows and Linux platforms with and without CUDA drivers. - **Deprecated OS**: Linux Aarch64 (0.2.6) and Mac x86 (0.2.10) - most features of llmware should work on these platforms, but new features integrated since those versions will not be available. If you have a particular need to work on one of these platforms, please raise an Issue, and we can work with you to try to find a solution. - **Linux**: we build to GLIBC 2.31+ - so Linux versions with older GLIBC drivers will generally not work (e.g., Ubuntu 18). To check the GLIBC version, you can use the command `ldd --version`. If it is 2.31 or any higher version, it should work. ___ ___ **Database** - LLMWare is an enterprise-grade data pipeline designed for persistent storage of key artifacts throughout the pipeline. We provide several options to parse 'in-memory' and write to jsonl files, but most of the functionality of LLMWare assumes that a persistent scalable data store will be used. - There are three different types of data storage used in LLMWare: 1. **Text Collection database** - all of the LLMWare parsers, by default, parse and text chunk unstructured content (and associated metadata) into one of three databases used for text collections, organized in Libraries - **MongoDB**, **Postgres** and **SQLite**. 2. **Vector database** - for storing and retrieving semantic embedding vectors, LLMWare supports the following vector databases - Milvus, PG Vector / Postgres, Qdrant, ChromaDB, Redis, Neo4J, Lance DB, Mongo-Atlas, Pinecone and FAISS. 3. **SQL Tables database** - for easily integrating table-based data into LLM workflows through the CustomTable class and for using in conjunction with a Text-2-SQL workflow - supported on Postgres and SQLite. - **Fast Start** option: you can start using SQLite locally without any separate installation by setting `LLMWareConfig.set_active_db("sqlite")` as shown in [configure_db_example](https://www.github.com/llmware-ai/llmware/blob/main/examples/Getting_Started/configure_db.py). For vector embedding examples, you can use ChromaDB, LanceDB or FAISS - all of which provide no-install options - just start using. - **Install DB dependencies**: we provide a number of Docker-Compose scripts which can be used, or follow install instructions provided by the database - generally easiest to install locally with Docker. 
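As a quick illustration of the no-install 'Fast Start' path described above, both settings can be made in a couple of lines before any libraries are created - a short sketch (the getter calls are shown only to confirm the configuration):
```python
from llmware.configs import LLMWareConfig

# no-install fast start: SQLite for text collections, ChromaDB for vector embeddings
LLMWareConfig().set_active_db("sqlite")
LLMWareConfig().set_vector_db("chromadb")

# confirm the active settings
print("text collection db: ", LLMWareConfig().get_active_db())
print("vector db: ", LLMWareConfig().get_vector_db())
```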
**LLMWare File Storage** - llmware stores a variety of artifacts during its operation locally in the /llmware_data path, which can be found as follows: ```python from llmware.configs import LLMWareConfig llmware_fp = LLMWareConfig().get_llmware_path() print("llmware_data path: ", llmware_fp) ``` - to change the llmware path, we can change both the 'home' path, which is the main filepath, and the 'llmware_data' path name as follows: ```python from llmware.configs import LLMWareConfig # changing the llmware home path - change home + llmware_path_name LLMWareConfig().set_home("/my/new/local/home/path") LLMWareConfig().set_llmware_path_name("llmware_data2") # check the new llmware home path llmware_fp = LLMWareConfig().get_llmware_path() print("updated llmware path: ", llmware_fp) ``` ___ ___ **Local Models** - LLMWare treats open source and locally deployed models as "first class citizens" with all classes, methods and examples designed to work first with smaller, specialized, locally-deployed models. - By default, most models are pulled from public HuggingFace repositories, and cached locally. LLMWare will store all models locally at the /llmware_data/model_repo path, with all assets found in a folder tree with the models name. - If a Pytorch model is pulled from HuggingFace, then it will appear in the default HuggingFace /.cache path. - To view the local model path: ```python from llmware.configs import LLMWareConfig model_fp = LLMWareConfig().get_model_repo_path() print("model repo path: ", model_fp) ``` # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
---
--- --- --- layout: default title: Working with Docker parent: Getting Started nav_order: 6 permalink: /getting_started/working_with_docker --- # Working with Docker Scripts This section is a short guide on setting up a Linux environment with Docker and running LLMWare examples with different database systems. ## 1. Python and Pip Python should come installed with your Linux environment. To install Pip, run the following: ``` sudo apt-get update sudo apt-get -y install python3-pip pip3 install --upgrade pip ``` ## 2. Docker and Docker Compose The latest versions of Docker and Docker Compose should be installed in order to use the Docker Compose files in the LLMWare repository. Instructions to install Docker: https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-20-04 (Steps 1-2) Note: Step 1 is necessary, Step 2 is optional but we highly recommend it. Instructions to install Docker Compose: https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-compose-on-ubuntu-20-04 (Step 1) Note: replace the URL in the `curl` command with the latest download from https://github.com/docker/compose/releases. Check that Docker is running on your system: ``` sudo systemctl status docker ``` ## 3. Running Docker Compose files `cd` into the repository and ensure that you can see files of the format `docker-compose-database-name.yaml`. To run a Compose file: ``` docker-compose -f docker-compose-database-name.yaml up -d ``` Check that the container is running: ``` docker ps ``` Note: this will list only the containers that are currently running, add the `-a` flag (`docker ps -a`) to list all containers (even those that are stopped). ## 4. Test with Examples The Compose files currently support 6 database systems: - Mongo - Postgres/PG Vector - Neo4j - Milvus - Qdrant - Redis Note: PG Vector is an alias for Postgres and is used for vector embeddings. 1. Mongo and Postgres are used as the active database to store library text collections. 2. PG Vector, Neo4j, Milvus, Qdrant and Redis are used as the vector database to store vector embeddings. To test that the containers are working as intended, you can modify an example provided in the LLMWare repository. The simplest example to modify is `fast_start/example-2-build_embeddings.py`. Open the file in an editor. 1. Change the argument passed in as the active database on line 128 to an appropriate active database (Mongo or Postgres). 2. Change the argument passed in as the vector database on line 138 to an appropriate vector database (PG Vector, Neo4j, Milvus, Qdrant or Redis). Run the example with these changes, and you should see updates in the terminal indicating that the embeddings are being generated correctly. Note: It is possible that you will see an error: ``` llmware.exceptions.EmbeddingModelNotFoundException: Embedding model for 'example2_library' could not be located ``` In this case, use a unique name for the library name passed in on line 147. ## 5. Stopping/Deleting Containers To stop a container, run: ``` docker stop container_ID_OR_container_name ``` To delete a container, run: ``` docker rm container_ID_OR_container_name ``` Note: passing in either the ID or the name will work. To find the ID or name of a container, run: ``` docker ps -a ``` --- # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). 
## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
---
--- --- --- layout: default title: Home | llmware nav_order: 1 description: llmware is an integrated framework with over 50+ models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. permalink: / --- ## Welcome to
llmware
## 🧰🛠️🔩 The Ultimate Toolkit for Enterprise RAG Pipelines with Small, Specialized Models From quickly building POCs to scalable LLM Apps for the enterprise, LLMWare is packed with all the tools you need. `llmware` is an integrated framework with 50+ small, specialized, open source models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. This project provides a comprehensive set of tools that anyone can use - from a beginner to the most sophisticated AI developer - to rapidly build industrial-grade, knowledge-based enterprise LLM applications. Our specific focus is on making it easy to integrate open source small specialized models and to connect enterprise knowledge safely and securely. ## Getting Started 1. Install llmware - `pip3 install llmware` 2. Make sure that you are running on a [supported platform](https://www.github.com/llmware-ai/llmware/tree/main/docs/getting_started/platforms.md#platform-support). 3. Learn by example: -- [Fast Start examples](https://www.github.com/llmware-ai/llmware/tree/main/fast_start) - structured set of 6 examples (with no DB installations required) to learn the main concepts of RAG with LLMWare - each example has extensive comments, and a supporting video on Youtube to walk you through it. -- [Getting Started examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Getting_Started) - heavily-annotated examples that review many getting started elements - selecting a database, loading sample files, working with libraries, and how to use the Model Catalog. -- [Use Case examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases) - longer examples that integrate several components of LLMWare to provide a framework for a solution for common use case patterns. -- Dive into a specific area of interest - [Parsing](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing) - [Models](https://www.github.com/llmware-ai/llmware/tree/main/examples/Models) - [Prompts](https://www.github.com/llmware-ai/llmware/tree/main/examples/Models) - [Agents](https://www.github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents) - and many more ... 4. We provide extensive [sample files](https://www.github.com/llmware-ai/llmware/tree/main/examples/Getting_Started/loading_sample_files.py) integrated into the examples, so you can copy-paste-run, quickly validate that the installation is set up correctly (see the short sketch below), and start seeing key classes and methods in action. We would encourage you to start with the 'out of the box' example first, and then use the example as the launching point for inserting your documents, models, queries, and workflows. 5. Learn by watching: check out the [LLMWare Youtube channel](https://www.youtube.com/@llmware). 6. Share with the community: join us on [Discord](https://discord.gg/MhZn5Nc39h). 
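Before moving on to the larger examples, a tiny end-to-end prompt is a good way to confirm that the install is working - a minimal sketch using the `bling-phi-3-gguf` model referenced throughout these examples (the question and context are illustrative):
```python
from llmware.prompts import Prompt

# load a small, CPU-friendly RAG model from the llmware catalog
prompter = Prompt().load_model("bling-phi-3-gguf")

# ask a question against a short piece of context
context = "The annual subscription fee is $199, billed in January of each year."
response = prompter.prompt_main("What is the annual subscription fee?", context=context,
                                prompt_name="default_with_context", temperature=0.0)

print("llm response: ", response["llm_response"])
```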
[Install llmware](#install-llmware){: .btn .btn-primary .fs-5 .mb-4 .mb-md-0 .mr-2 } [Common Setup & Configuration Items](#platform-support){: .btn .fs-5 .mb-4 .mb-md-0 } [Architecture](architecture.md/#llmware-architecture){: .btn .fs-5 .mb-4 .mb-md-0 } [View llmware on GitHub](https://www.github.com/llmware-ai/llmware/tree/main){: .btn .fs-5 .mb-4 .mb-md-0 } [Open an Issue on GitHub](https://www.github.com/llmware-ai/llmware/issues){: .btn .fs-5 .mb-4 .mb-md-0 } # Install llmware ___ **Using Pip Install** - Installing llmware is easy: `pip3 install llmware` - If you prefer, we also provide a set of recent wheels in the [wheel archives](https://www.github.com/llmware-ai/llmware/tree/main/wheel_archives) in this repository, which can be downloaded individually and used as follows: ```bash pip3 install llmware-0.2.12-py3-none-any.whl ``` - We generally keep the main branch of this repository current with all changes, but we only publish new wheels to PyPI approximately once per week ___ ___ **Cloning the Repository** - If you prefer to clone the repository: ```bash git clone git@github.com:llmware-ai/llmware.git ``` - The llmware package is contained entirely in the /llmware folder path, so you should be able to drop this folder (with all of its contents) into a project tree, and use the llmware module essentially the same as a pip install. - Please ensure that you are capturing and updating the /llmware/lib folder, which includes required compiled shared libraries. If you prefer, you can keep only those libs required for your OS platform. - After cloning the repo, we provide a short 'welcome to llmware' automation script, which can be used to install the project's requirements (from llmware/requirements.txt), install several optional dependencies that are commonly used in examples, copy several good 'getting started' examples into the root folder, and then run a 'welcome_example.py' script to get started using our models. To use the "welcome to llmware" script: Windows: ```bash .\welcome_to_llmware_windows.sh ``` Mac/Linux: ```bash sh ./welcome_to_llmware.sh ``` # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discord channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. 
## License `llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
---
--- --- --- layout: default title: Advanced RAG parent: Learn nav_order: 4 description: overview of the major modules and classes of LLMWare permalink: /learn/advanced_techniques_for_rag --- llmware Youtube Video Channel --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Advanced RAG Techniques ** - [Best Small RAG Model - Bling-Phi-3](https://youtu.be/cViMonCAeSc?si=L6jX0sRdZAmKtRcz) - [Agent Automation with Web Services for Financial Research](https://youtu.be/l0jzsg1_Ik0?si=oBGtALHLplouY9x2) - [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG) - [Are you prompting wrong for RAG - Stochastic Sampling-Part I](https://youtu.be/7oMTGhSKuNY?si=_KSjuBnqArvWzYbx) - [Are you prompting wrong for RAG - Stochastic Sampling-Part II- Code Experiments](https://youtu.be/iXp1tj-pPjM?si=3ZeMgipY0vJDHIMY) - [Text2SQL Intro](https://youtu.be/BKZ6kO2XxNo?si=tXGt63pvrp_rOlIP) - [Pop up LLMWare Inference Server](https://www.youtube.com/watch?v=qiEmLnSRDUA&t=20s) - [Hardest Problem in RAG - handling 'Not Found'](https://youtu.be/slDeF7bYuv0?si=j1nkdwdGr5sgvUtK) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- --- --- --- layout: default title: Core RAG Scenarios Running Locally parent: Learn nav_order: 2 description: overview of the major modules and classes of LLMWare permalink: /learn/core_rag_scenarios_running_locally --- Core RAG Scenarios Run Locally --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Core RAG Scenarios** - [Use small LLMs for RAG for Contract Analysis (feat. LLMWare)](https://www.youtube.com/watch?v=8aV5p3tErP0) - [Invoice Processing with LLMware](https://www.youtube.com/watch?v=VHZSaBBG-Bo&t=10s) - [Evaluate LLMs for RAG with LLMWare](https://www.youtube.com/watch?v=s0KWqYg5Buk&t=105s) - [Fast Start to RAG with LLMWare Open Source Library](https://www.youtube.com/watch?v=0naqpH93eEU) - [Use Retrieval Augmented Generation (RAG) without a Database](https://www.youtube.com/watch?v=tAGz6yR14lw) - [Best Small RAG Model - Bling-Phi-3](https://youtu.be/cViMonCAeSc?si=L6jX0sRdZAmKtRcz) - [RAG with BLING on your laptop](https://www.youtube.com/watch?v=JjgqOZ2v5oU) - [DRAGON-7B-Models](https://www.youtube.com/watch?v=d_u7VaKu6Qk&t=37s) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- --- --- --- layout: default title: Voice Transcription with Whisper CPP parent: Learn nav_order: 6 description: overview of the major modules and classes of LLMWare permalink: /learn/integrated_voice_transcription_with_whisper_cpp --- Integrated Voice Transcription with Whisper CPP --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Using Whisper CPP Models** - [Getting Started with Whisper.CPP](https://youtu.be/YG5u5AOU9MQ?si=5xQYZCILPSiR8n4s) - [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- --- --- --- layout: default title: Learn nav_order: 4 has_children: true description: key learning resources permalink: /learn --- Learn: Youtube Video Series --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Some of our most recent videos** - [Best Small RAG Model - Bling-Phi-3](https://youtu.be/cViMonCAeSc?si=L6jX0sRdZAmKtRcz) - [Agent Automation with Web Services for Financial Research](https://youtu.be/l0jzsg1_Ik0?si=oBGtALHLplouY9x2) - [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG) - [Are you prompting wrong for RAG - Stochastic Sampling-Part I](https://youtu.be/7oMTGhSKuNY?si=_KSjuBnqArvWzYbx) - [Are you prompting wrong for RAG - Stochastic Sampling-Part II- Code Experiments](https://youtu.be/iXp1tj-pPjM?si=3ZeMgipY0vJDHIMY) 🎬 **Using Agents, Function Calls and SLIM models** - [SLIMS Playlist](https://youtube.com/playlist?list=PL1-dn33KwsmAHWCWK6YjZrzicQ2yR6W8T&si=TSFGqQ3ObOO5vDde) - [Agent-based Complex Research Analysis](https://youtu.be/y4WvwHqRR60?si=jX3KCrKcYkM95boe) - [Getting Started with SLIMs (with code)](https://youtu.be/aWZFrTDmMPc?si=lmo98_quo_2Hrq0C) - [SLIM Models Intro](https://www.youtube.com/watch?v=cQfdaTcmBpY) - [Text2SQL Intro](https://youtu.be/BKZ6kO2XxNo?si=tXGt63pvrp_rOlIP) - [Pop up LLMWare Inference Server](https://www.youtube.com/watch?v=qiEmLnSRDUA&t=20s) - [Hardest Problem in RAG - handling 'Not Found'](https://youtu.be/slDeF7bYuv0?si=j1nkdwdGr5sgvUtK) - [Extract Information from Earnings Releases](https://youtu.be/d6HFfyDk4YE?si=VmnIiWFmgBtR4DxS) - [Summary Function Calls](https://youtu.be/yNg_KH5cPSk?si=Yl94tp_vKA8e7eT7) - [Boolean Yes-No Function Calls](https://youtu.be/jZQZMMqAJXs?si=lU4YVI0H0tfc9k6e) - [Autogenerate Topics, Tags and NER](https://youtu.be/N6oOxuyDsC4?si=vo2Fd8VG5xTbH4SD) 🎬 **Using GGUF Models** - [Using LM Studio Models](https://www.youtube.com/watch?v=h2FDjUyvsKE) - [Using Ollama Models](https://www.youtube.com/watch?v=qITahpVDuV0) - [Use any GGUF Model](https://www.youtube.com/watch?v=9wXJgld7Yow) - [Background on GGUF Quantization & DRAGON Model Example](https://www.youtube.com/watch?v=ZJyQIZNJ45E) - [Getting Started with Whisper.CPP](https://youtu.be/YG5u5AOU9MQ?si=5xQYZCILPSiR8n4s) 🎬 **Core RAG Scenarios Running Locally** - [RAG with BLING on your laptop](https://www.youtube.com/watch?v=JjgqOZ2v5oU) - [DRAGON-7B-Models](https://www.youtube.com/watch?v=d_u7VaKu6Qk&t=37s) - [Use small LLMs for RAG for Contract Analysis (feat. 
LLMWare)](https://www.youtube.com/watch?v=8aV5p3tErP0) - [Invoice Processing with LLMware](https://www.youtube.com/watch?v=VHZSaBBG-Bo&t=10s) - [Evaluate LLMs for RAG with LLMWare](https://www.youtube.com/watch?v=s0KWqYg5Buk&t=105s) - [Fast Start to RAG with LLMWare Open Source Library](https://www.youtube.com/watch?v=0naqpH93eEU) - [Use Retrieval Augmented Generation (RAG) without a Database](https://www.youtube.com/watch?v=tAGz6yR14lw) 🎬 **Parsing, Embedding, Data Pipelines and Extraction** - [Ingest PDFs at Scale](https://www.youtube.com/watch?v=O0adUfrrxi8&t=10s) - [Install and Compare Multiple Embeddings with Postgres and PGVector](https://www.youtube.com/watch?v=Bncvggy6m5Q) - [Intro to Parsing and Text Chunking](https://youtu.be/2xDefZ4oBOM?si=YZzBUjDfQ0839EVF) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- --- --- --- layout: default title: Other Topics parent: Learn nav_order: 7 description: overview of the major modules and classes of LLMWare permalink: /learn/other_topics --- Other Notable Videos and Topics --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Some of our most recent videos** - [Fast Local Chatbot with Phi-3-GGUF](https://youtu.be/gzzEVK8p3VM?si=HTMWQtN9XuaqjmpK) - [Document Summarization](https://youtu.be/Ps3W-P9A1m8?si=mHvCcHvrKzndaNul) - [Agent Server](https://youtu.be/nsA6-ZdnkXg?si=v7iGhC_rpj8TWbbl) - [Best Small RAG Model - Bling-Phi-3](https://youtu.be/cViMonCAeSc?si=L6jX0sRdZAmKtRcz) - [Agent Automation with Web Services for Financial Research](https://youtu.be/l0jzsg1_Ik0?si=oBGtALHLplouY9x2) - [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG) - [Are you prompting wrong for RAG - Stochastic Sampling-Part I](https://youtu.be/7oMTGhSKuNY?si=_KSjuBnqArvWzYbx) - [Are you prompting wrong for RAG - Stochastic Sampling-Part II- Code Experiments](https://youtu.be/iXp1tj-pPjM?si=3ZeMgipY0vJDHIMY) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- --- --- --- layout: default title: Parsing Embedding and Data Extraction parent: Learn nav_order: 5 description: overview of the major modules and classes of LLMWare permalink: /learn/parsing_embedding_data_extraction --- Parsing, Embedding, and Data Extraction --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Parsing, Embedding, Data Pipelines and Extraction** - [Advanced Parsing Techniques](https://youtu.be/dEsw8V_YBYY?si=B0GTVNhwfBYWkXyf) - [Ingest PDFs at Scale](https://www.youtube.com/watch?v=O0adUfrrxi8&t=10s) - [Install and Compare Multiple Embeddings with Postgres and PGVector](https://www.youtube.com/watch?v=Bncvggy6m5Q) - [Intro to Parsing and Text Chunking](https://youtu.be/2xDefZ4oBOM?si=YZzBUjDfQ0839EVF) - [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- --- --- --- layout: default title: Using Agents & Function Calls with SLIM Models parent: Learn nav_order: 1 description: overview of the major modules and classes of LLMWare permalink: /learn/using_agents_functions_slim_models --- Using Agents, Function Calls and SLIM Models --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Using Agents, Function Calls and SLIM models** - [Agent Automation with Web Services for Financial Research](https://youtu.be/l0jzsg1_Ik0?si=oBGtALHLplouY9x2) - [Sentiment Analysis](https://youtu.be/ERCHP21oAN8?si=fp6D4Tk9J2HdDRXa) - [SLIMS Playlist](https://youtube.com/playlist?list=PL1-dn33KwsmAHWCWK6YjZrzicQ2yR6W8T&si=TSFGqQ3ObOO5vDde) - [Agent-based Complex Research Analysis](https://youtu.be/y4WvwHqRR60?si=jX3KCrKcYkM95boe) - [Getting Started with SLIMs (with code)](https://youtu.be/aWZFrTDmMPc?si=lmo98_quo_2Hrq0C) - [SLIM Models Intro](https://www.youtube.com/watch?v=cQfdaTcmBpY) - [Text2SQL Intro](https://youtu.be/BKZ6kO2XxNo?si=tXGt63pvrp_rOlIP) - [Pop up LLMWare Inference Server](https://www.youtube.com/watch?v=qiEmLnSRDUA&t=20s) - [Hardest Problem in RAG - handling 'Not Found'](https://youtu.be/slDeF7bYuv0?si=j1nkdwdGr5sgvUtK) - [Extract Information from Earnings Releases](https://youtu.be/d6HFfyDk4YE?si=VmnIiWFmgBtR4DxS) - [Summary Function Calls](https://youtu.be/yNg_KH5cPSk?si=Yl94tp_vKA8e7eT7) - [Boolean Yes-No Function Calls](https://youtu.be/jZQZMMqAJXs?si=lU4YVI0H0tfc9k6e) - [Autogenerate Topics, Tags and NER](https://youtu.be/N6oOxuyDsC4?si=vo2Fd8VG5xTbH4SD) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- --- --- --- layout: default title: Using Quantized GGUF Models parent: Learn nav_order: 3 description: overview of the major modules and classes of LLMWare permalink: /learn/using_quantized_gguf_models --- Using Quantized GGUF Models --- **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples. Check back often as this list is always growing ... 🎬 **Using GGUF Models** - [Using LM Studio Models](https://www.youtube.com/watch?v=h2FDjUyvsKE) - [Using Ollama Models](https://www.youtube.com/watch?v=qITahpVDuV0) - [Use any GGUF Model](https://www.youtube.com/watch?v=9wXJgld7Yow) - [Background on GGUF Quantization & DRAGON Model Example](https://www.youtube.com/watch?v=ZJyQIZNJ45E) - [Getting Started with Whisper.CPP](https://youtu.be/YG5u5AOU9MQ?si=5xQYZCILPSiR8n4s) - [Best Small RAG Model - Bling-Phi-3](https://youtu.be/cViMonCAeSc?si=L6jX0sRdZAmKtRcz) - [Agent Automation with Web Services for Financial Research](https://youtu.be/l0jzsg1_Ik0?si=oBGtALHLplouY9x2) - [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG) - [Are you prompting wrong for RAG - Stochastic Sampling-Part I](https://youtu.be/7oMTGhSKuNY?si=_KSjuBnqArvWzYbx) - [Are you prompting wrong for RAG - Stochastic Sampling-Part II- Code Experiments](https://youtu.be/iXp1tj-pPjM?si=3ZeMgipY0vJDHIMY) # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) # About the project `llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). ## Contributing Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). You can also write an email or start a discussion on our Discrod channel. Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). ## Code of conduct We welcome everyone into the ``llmware`` community. [View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. ## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) ``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. [AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. ## License `llmware` is distributed by an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). ## Thank you to the contributors of ``llmware``!
{% for contributor in site.github.contributors %}
- {{ contributor.login }}
{% endfor %}
--- ---