Best practices for current data analysis

Are there any best practices for setting up a research crew that uses tools to conduct current and historical data analysis from a variety of web and API sources? My crew seems to rely on information from the LLM’s training timeframe, e.g., data from 2023.

@achris7 You need to connect the LLM to the web. Use a tool like Exa, for example. CrewAI has an implementation of it. See the EXASearchTool.

@rokbenko Indeed, I’m using Exa and Serper.

The crew is still pulling market data from 2023, though.

Should I be using Knowledge? I’m a bit confused about when to use Knowledge and when to use custom tools, e.g., a market data API.

That’s very strange. Can you please share your full code? It should work. Let’s try to fix the issue!

Knowledge is meant for fact-based data you want to pass to the crew. For example, you have some internal business documents. In that case, you would utilize Knowledge. Why?

  1. It’s data that doesn’t change very often.
  2. It’s data that Exa cannot find searching the web.

Exa is meant for up-to-date data you want to pass to the crew. For example, you want to know what the current weather is in your hometown. In that case, you would utilize Exa as a tool. Why?

  1. It’s data that changes every minute. It doesn’t make sense to provide this data utilizing Knowledge since after a minute has passed, it’s outdated data. You would need to go and update your Knowledge source every minute.
  2. It’s data that Exa can find searching the web.
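To make the distinction concrete, here is a minimal sketch in plain Python with no CrewAI imports. `SnapshotKnowledge` and `LiveTool` are hypothetical stand-ins, not CrewAI classes, and the sketch assumes that a knowledge source is read once at startup while a tool is called on demand:

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SnapshotKnowledge:
    """Hypothetical stand-in for a knowledge source: content is read once, up front."""
    fetch: Callable[[], str]
    content: str = field(init=False)

    def __post_init__(self) -> None:
        # The fetch happens a single time, when the source is created.
        self.content = self.fetch()

    def query(self) -> str:
        # Later queries only see the stored snapshot.
        return self.content


@dataclass
class LiveTool:
    """Hypothetical stand-in for a tool: the fetch runs on every call."""
    fetch: Callable[[], str]

    def run(self) -> str:
        return self.fetch()


# A fake data feed whose value changes over time (made-up values).
current = {"value": "BTC=100"}


def read_feed() -> str:
    return current["value"]


knowledge = SnapshotKnowledge(fetch=read_feed)
tool = LiveTool(fetch=read_feed)

current["value"] = "BTC=105"  # the data changes after startup

stale = knowledge.query()  # still the value captured at startup
fresh = tool.run()         # the value at call time
print(stale, fresh)
```

Under that assumption, anything loaded as a snapshot is only as current as the moment it was loaded, which is why fast-changing data is a better fit for a tool.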

FWIW, I added a market data API as Knowledge, and now the crew is using current data. The Exa tool was not, and still is not, being utilized.

Here’s one of the agents, sans background, goal, etc. (I also have a config file to import Exa and an env file with the API creds.)

        role='Alpha Crypto Research Analyst',
        goal='.',
        backstory=""": ....""",
        tools=[
            market_data_tool, 
            top_coins_tool,
            technical_analysis_tool,
            sentiment_tool,
            youtube_tool,
            exa_search,
            serper,
            scrape_website,
            firecrawl,
            code_interpreter
        ],
        knowledge_sources=[coingecko_knowledge],
        allow_delegation=True,
        verbose=True,
        llm=llm,
        memory=True,
        allow_code_execution=True
    )

and here’s the knowledge class

# Imports were not shown in the original post; these are the ones the class needs.
from datetime import datetime, timezone
from typing import Any, Dict, List

from pydantic import Field
from crewai.knowledge.source.base_knowledge_source import BaseKnowledgeSource

# `cg` is assumed to be a pycoingecko CoinGeckoAPI client created elsewhere (not shown in the post).


class CoinGeckoKnowledgeSource(BaseKnowledgeSource):
    """Knowledge source that fetches comprehensive real-time market data from CoinGecko Pro API."""

    limit: int = Field(
        default=250,  # Pro API supports higher limits
        description="Number of top cryptocurrencies to fetch data for"
    )
    vs_currency: str = Field(
        default="usd",
        description="Currency to get prices in"
    )

    def load_content(self) -> Dict[Any, str]:
        """Fetch and format comprehensive cryptocurrency market data using Pro API."""
        try:
            # Get comprehensive market data
            coins_data = cg.get_coins_markets(
                vs_currency=self.vs_currency,
                order='market_cap_desc',
                per_page=self.limit,
                sparkline=True,
                price_change_percentage='1h,24h,7d,14d,30d,200d,1y'
            )

            # Get global market data
            global_data = cg.get_global()

            # Get trending coins
            trending = cg.get_search_trending()

            # Format all data
            formatted_data = self._format_market_data(coins_data, global_data, trending)

            # Generate a unique ID based on timestamp to avoid ChromaDB conflicts
            timestamp = datetime.now().strftime('%Y%m%d%H%M%S%f')
            return {f"crypto_market_data_{timestamp}": formatted_data}

        except Exception as e:
            error_msg = str(e)
            print(f"CoinGecko Pro API Error: {error_msg}")
            timestamp = datetime.now().strftime('%Y%m%d%H%M%S%f')
            return {
                f"crypto_market_data_error_{timestamp}": f"""
                Cryptocurrency Market Data Error

                Failed to fetch market data from CoinGecko Pro API.
                Error: {error_msg}

                API Configuration:
                - Endpoint: {cg.api_base_url}
                - Plan: Analyst
                - Rate Limit: 500 calls/minute

                Please verify:
                1. API key validity
                2. Network connectivity
                3. API service status
                """
            }

    def _format_market_data(self, coins_data: List[Dict], global_data: Dict, trending: Dict) -> str:
        """Format comprehensive cryptocurrency market data."""
        # Use an explicit UTC clock so the "UTC" label in the header is accurate
        timestamp = datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M:%S')
        formatted = f"Cryptocurrency Market Data (as of {timestamp} UTC)\n\n"

        # Add global market data
        market_data = global_data.get('data', {})
        if market_data:
            formatted += f"""Global Market Overview:

Total Market Cap: ${market_data.get('total_market_cap', {}).get('usd', 0):,.2f}
24h Volume: ${market_data.get('total_volume', {}).get('usd', 0):,.2f}
BTC Dominance: {market_data.get('market_cap_percentage', {}).get('btc', 0):.2f}%
ETH Dominance: {market_data.get('market_cap_percentage', {}).get('eth', 0):.2f}%
Active Cryptocurrencies: {market_data.get('active_cryptocurrencies', 0)}
Markets: {market_data.get('markets', 0)}
\n"""

        # Add trending coins
        if trending and trending.get('coins'):
            formatted += "Trending Coins:\n"
            for item in trending['coins'][:5]:  # Top 5 trending
                coin = item['item']
                formatted += f"• {coin['name']} ({coin['symbol'].upper()}) - Rank #{coin.get('market_cap_rank', 'N/A')}\n"
            formatted += "\n"

        # Add detailed market data
        if coins_data:
            formatted += "Top Cryptocurrencies by Market Cap:\n"
            for coin in coins_data:
                try:
                    price_changes = {
                        '1h': coin.get('price_change_percentage_1h_in_currency', 0),
                        '24h': coin.get('price_change_percentage_24h', 0),
                        '7d': coin.get('price_change_percentage_7d_in_currency', 0),
                        '30d': coin.get('price_change_percentage_30d_in_currency', 0),
                        '1y': coin.get('price_change_percentage_1y_in_currency', 0)
                    }

                    formatted += f"""

{coin['name']} ({coin['symbol'].upper()}):
Market Cap Rank: #{coin.get('market_cap_rank', 'N/A')}
Price: ${coin.get('current_price', 0):,.8f}
Market Cap: ${coin.get('market_cap', 0):,.2f}
24h Volume: ${coin.get('total_volume', 0):,.2f}
Price Changes:
1h: {price_changes['1h']:>7.2f}%
24h: {price_changes['24h']:>7.2f}%
7d: {price_changes['7d']:>7.2f}%
30d: {price_changes['30d']:>7.2f}%
1y: {price_changes['1y']:>7.2f}%
Supply:
Circulating: {coin.get('circulating_supply', 0):,.0f}
Total: {coin.get('total_supply', 0):,.0f}
Max: {coin.get('max_supply', 'Unlimited') if coin.get('max_supply') else 'Unlimited'}
-------------------"""
                except Exception:
                    # Skip coins whose fields are missing or None (number formatting would fail)
                    continue

        return formatted

    def add(self) -> None:
        """Process and store the market data."""
        content = self.load_content()

        for doc_id, text in content.items():
            chunks = self._chunk_text(text)
            self.chunks.extend(chunks)

        # Create metadata list matching number of chunks
        metadatas = [{
            'source': 'coingecko',
            'type': 'market_data',
            'timestamp': datetime.now().isoformat(),
            'currency': self.vs_currency,
            'limit': self.limit,
            'chunk_id': i
        } for i in range(len(self.chunks))]

        self.save_documents(metadata=metadatas)
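
For anyone following along, the chunk-and-tag step in `add` can be sketched in plain Python. `chunk_text` and `build_metadata` are hypothetical helpers, not CrewAI’s `_chunk_text`, and the chunk size and overlap values are assumptions chosen only for illustration:

```python
from datetime import datetime, timezone
from typing import Dict, List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> List[str]:
    """Split text into fixed-size chunks that overlap by `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def build_metadata(chunks: List[str], currency: str) -> List[Dict[str, object]]:
    """Build one metadata dict per chunk, all sharing a single fetch timestamp."""
    fetched_at = datetime.now(timezone.utc).isoformat()
    return [
        {
            "source": "coingecko",
            "type": "market_data",
            "timestamp": fetched_at,
            "currency": currency,
            "chunk_id": i,
        }
        for i in range(len(chunks))
    ]


# Placeholder text standing in for the formatted market report.
sample = "x" * 450
chunks = chunk_text(sample)
metadatas = build_metadata(chunks, "usd")
print(len(chunks), len(metadatas))
```

The point of the sketch is that the metadata list has exactly one entry per chunk, and that every entry carries the time the data was fetched, so a reader can tell how old a chunk is.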