Advice on how to prompt coding agents

Dear CrewAI community,

This is less an issue than a request for advice on how to prompt coding agents.

My crew uses agents that write their own code to analyse large datasets with CodeInterpreterTool.

However, if you don’t provide any rules or guidelines for their code, it is often inconsistent and error-prone.

I am writing a list of coding guidelines to pass in the agent prompt, so that my agents’ code generation is consistent and reliable and does not repeat the mistakes it is prone to.

Are there any well-known, commonly used ‘coding guidelines’ prompts that I could pass to my agents to improve consistency and reduce errors?

By the way, this is what my ‘coding guidelines’ prompt looks like:

    Generic Coding Guidelines:
        - Ensure all required packages are installed and imported before the code is run, and end execution immediately if an import fails
        - Use concise and efficient Python code to analyze the data
        - Consider computational efficiency for large datasets
        - Don't plot graphs
        - For each tool call, import the required packages again
        - If the tool doesn’t provide a final answer after 3 attempts, stop using it and return what you already have.
    
    Guidelines for using module references (such as np for numpy, pd for pandas):
        Import and use modules at the module level: Always import and use external modules (like numpy, pandas) at the top level of your script, not inside functions. This ensures proper module binding in isolated execution contexts.
        Good:
            import numpy as np
            import pandas as pd
            # Use np/pd at module level
            numeric_cols = df.select_dtypes(include=[np.number]).columns
            def analyze_data(df, numeric_cols):
                for col in numeric_cols:  # Use the pre-computed result, passed in as a parameter
                    pass  # analysis code
        
        Avoid:
            import numpy as np
            import pandas as pd
            def analyze_data(df):
                # Don't use np inside functions 
                numeric_cols = df.select_dtypes(include=[np.number]).columns
        
        Pre-compute module-dependent values: If you need to use module functions or constants, compute them once at the module level and pass the results to functions as parameters.
        Use string-based alternatives when possible: For pandas operations, prefer string-based dtype selection over numpy types:
            # Prefer this:
            df.select_dtypes(include='number')  # 'number' matches all numeric dtypes, like np.number
            # Over this:
            df.select_dtypes(include=[np.number])
        This ensures your code will work reliably without errors such as "name 'np' is not defined"

    Use of CodeInterpreterTool:
        - When sending code to CodeInterpreterTool, end each line with '\n' and nothing else
        - Make sure to always:
            - PRINT your code's output
            - AND store the output in the variable "result" (THIS IS VERY IMPORTANT)
        - Try to complete all instructions in one tool call; only rerun if there are errors, for example column parsing errors.
        - However, if the dataset is very large, break up the dataset into smaller chunks and run the tool on each chunk
            - For example, if the dataset has a very large number of columns, break it up into smaller groups of columns, and run one tool call for each group
            - DO NOT BREAK UP THE DATASET BY ROWS, ONLY BY COLUMNS.

    Formatting of code output:
        - (THIS IS VERY IMPORTANT) In case the number of rows where an issue is identified is too large:
            - Limit the output to display a maximum of only 10 row numbers per issue.
            - If more than 10 rows have the issue, print just 10 row numbers and summarise the rest by giving the total number of rows that have the issue.
        - When printing row numbers, output ONLY the row number, with no additional values.
        - DO NOT print out row numbers as np.int64, or other such data types. ONLY print out the row number as an integer.

    Path Formatting:
        - Remember to use the full CSV file path provided to you when conducting analysis on the sensor data
        - Important: When using the full CSV file path in Python code, ALWAYS use one of these formats:
            1. Raw strings with single backslashes: r'C:\Users\path\to\file.csv'
            2. Forward slashes: 'C:/Users/path/to/file.csv'
            3. Double backslashes: 'C:\\Users\\path\\to\\file.csv'
        - NEVER use four backslashes between path components, as this produces doubled backslashes in the resulting path.
"""
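
For context on the module-reference guidelines: my working theory for the "name 'np' is not defined" error is that the tool runs the generated code through exec() with separate globals and locals dictionaries. I have not confirmed this against the CodeInterpreterTool source, so treat the sketch below as an assumption. It uses the standard-library math module as a stand-in for numpy, and the names AGENT_CODE, run_with_split_namespaces and run_with_single_namespace are mine, purely for illustration.

```python
# Hypothetical stand-in for code an agent might generate.
AGENT_CODE = """
import math

def root(x):
    return math.sqrt(x)  # 'math' is looked up in the function's globals

result = root(16)
"""


def run_with_split_namespaces(code):
    """Run code with separate globals and locals dicts (assumed tool behaviour)."""
    exec_locals = {}
    try:
        # Top-level names (math, root, result) are stored in exec_locals,
        # but functions resolve free names in the globals dict, which is empty.
        exec(code, {}, exec_locals)
    except NameError as exc:
        return f"NameError: {exc}"
    return exec_locals.get("result")


def run_with_single_namespace(code):
    """Run the same code with one dict acting as both globals and locals."""
    namespace = {}
    exec(code, namespace)
    return namespace.get("result")


split_outcome = run_with_split_namespaces(AGENT_CODE)
single_outcome = run_with_single_namespace(AGENT_CODE)
print(split_outcome)   # a NameError message about 'math'
print(single_outcome)  # 4.0
```

If that theory holds, it would also explain why the output has to be stored in a variable named "result", and why values are safer passed into functions as parameters than read from the top level.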

Additionally, is the agent prompt the best place to put a generic list of coding guidelines like this?
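
One alternative I have considered is enforcing some of the rules in code rather than in the prompt. As a minimal sketch, the output-formatting rules (at most 10 row numbers per issue, plain integers only, plus a total) could live in a small helper that the agent is told to call. The name format_issue_rows is hypothetical and not part of CrewAI, and I have not tested whether agents use such a helper reliably.

```python
MAX_ROWS_SHOWN = 10  # cap taken from the formatting guideline in my prompt


def format_issue_rows(issue_name, row_numbers, max_shown=MAX_ROWS_SHOWN):
    """Return one summary line with at most max_shown row numbers as plain ints."""
    # int() turns NumPy integer scalars into built-in ints, so no 'np.int64(...)' is printed.
    rows = [int(r) for r in row_numbers]
    if not rows:
        return f"{issue_name}: no rows"
    shown = ", ".join(str(r) for r in rows[:max_shown])
    line = f"{issue_name}: rows {shown}"
    if len(rows) > max_shown:
        line += f" (showing {max_shown} of {len(rows)} rows with this issue)"
    return line


# Example with 13 made-up row numbers: 3, 6, ..., 39
summary = format_issue_rows("missing value", range(3, 40, 3))
print(summary)
```

The trade-off I see is that a helper makes the formatting deterministic, while a prompt rule only makes it likely.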