
This work is licensed under a Creative Commons Attribution 4.0 International License

Data Loading and Preprocessing

Last updated 6 months ago

This page is currently in draft

A generic approach may use an online Python interpreter such as Google Colab.

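A code example may be found in the notebook LLM-Development/Colab/Data_Loading_and_Preprocessing.ipynb in the SingularityNET-Archive/LLM-Development repository on GitHub.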

Here's what this code does:

  1. We use the built-in json module to load the JSON data directly from the file, instead of using the JSONLoader.

  2. We iterate over the root-level array data, and for each item (workgroup meeting object), we use the CharacterTextSplitter to split the string representation of the item into chunks. We extend the chunks list with the resulting chunks.

  3. We convert each chunk in the chunks list into a Document object.

  4. Finally, we iterate over the docs list and print the content of each document.
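
The four steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the notebook's actual code: the `Document` dataclass and the `split_text` helper are hypothetical stand-ins for LangChain's `Document` and `CharacterTextSplitter`, the chunk sizes are arbitrary, and each item is serialised with `json.dumps` as one possible "string representation".

```python
import json
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for LangChain's Document: text plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def split_text(text, chunk_size=200, chunk_overlap=20):
    """Simplified stand-in for CharacterTextSplitter.

    Cuts the text into fixed-size character windows that overlap by
    chunk_overlap characters. The real splitter also honours separators.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    stop = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, stop, step)]


def load_documents(path, chunk_size=200, chunk_overlap=20):
    # Step 1: load the JSON data with the built-in json module.
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError("expected a root-level JSON array")

    # Step 2: split the string form of each item and collect the chunks.
    chunks = []
    for item in data:
        item_text = json.dumps(item, ensure_ascii=False)
        chunks.extend(split_text(item_text, chunk_size, chunk_overlap))

    # Step 3: wrap each chunk in a Document object.
    return [Document(page_content=chunk) for chunk in chunks]


def print_documents(docs):
    # Step 4: print the content of each document.
    for doc in docs:
        print(doc.page_content)
```

Calling `print_documents(load_documents("meetings.json"))` would run all four steps, where `meetings.json` is a placeholder file name. With LangChain installed, the stand-ins would be replaced by the library's own classes.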

This solution assumes that your JSON data is a root-level array of workgroup meeting objects. If your JSON data has a different structure, you may need to modify the code accordingly.

Please note that this solution doesn't handle nested JSON structures or complex data types within the JSON objects.

If you need more advanced JSON parsing capabilities, you may want to consider using a dedicated JSON processing library like jsonpath-ng or jq directly.
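
For moderately nested data, one lightweight alternative to a dedicated library is to flatten each object into "path: value" lines before splitting. The sketch below is an assumption about how that could be done with the standard library alone; the `flatten` and `to_text` helpers and the field names in the usage note are hypothetical and do not come from the notebook.

```python
def flatten(value, prefix=""):
    """Yield (path, leaf) pairs for every scalar inside nested dicts and lists.

    Dictionary keys are joined with dots and list positions are written as
    [index], for example "meetingInfo.people[0]".
    """
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            yield from flatten(child, path)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}[{index}]")
    else:
        yield prefix, value


def to_text(item):
    """Render one JSON object as newline-separated "path: value" lines."""
    return "\n".join(f"{path}: {leaf}" for path, leaf in flatten(item))
```

For instance, `to_text({"workgroup": "Archives", "meetingInfo": {"people": ["A", "B"]}})` produces one line per leaf value, which keeps the field path next to each value when the text is later split into chunks.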

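Sample code for loading JSON files into LangChain

Typically, a structured text source such as JSON will be used.

An example of how to load JSON files into LangChain may be found in the notebook LLM-Development/Colab/JSON_Loader.ipynb in the SingularityNET-Archive/LLM-Development repository on GitHub.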

Google Colab
LLM-Development/Colab/Data_Loading_and_Preprocessing.ipynb at main · SingularityNET-Archive/LLM-Development (GitHub)
LLM-Development/Colab/JSON_Loader.ipynb at main · SingularityNET-Archive/LLM-Development (GitHub)