Common Formats Supported by Pandas

Format TypeDescriptionReader MethodWriter Method
CSVComma-Separated Valuesread_csvto_csv
JSONJavaScript Object Notationread_jsonto_json
HTMLHypertext Markup Languageread_htmlto_html
ExcelMicrosoft Excel (XLS, XLSX)read_excelto_excel
SQLSQL Databasesread_sql, read_sql_queryto_sql
ParquetColumnar Storageread_parquetto_parquet
PicklePython Object Serializationread_pickleto_pickle

File Handling with Examples

1. CSV Input/Output

  • Reading a CSV:

    df = pd.read_csv('example.csv')  # Reads CSV into DataFrame
  • Writing a CSV:

    df.to_csv('new_file.csv', index=False)  # Saves DataFrame to CSV

2. HTML Input/Output

  • Reading HTML Tables:

    tables = pd.read_html('https://en.wikipedia.org/wiki/World_population')
    • This reads all <table> elements from a webpage into a list of DataFrames.
  • Writing to HTML:

    df.to_html('output.html', index=False)

3. Excel Input/Output

  • Reading Excel Sheets:

    df = pd.read_excel('file.xlsx', sheet_name='Sheet1')
  • Writing to Excel:

    df.to_excel('output.xlsx', sheet_name='Sheet1', index=False)

4. SQL Connections

  • Creating a temporary SQLite database:

    from sqlalchemy import create_engine
    engine = create_engine('sqlite:///:memory:')
    df.to_sql('table_name', con=engine)
  • Querying the SQL database:

    result = pd.read_sql("SELECT * FROM table_name", con=engine)

Additional Notes

  • For Excel, install necessary libraries: pip install openpyxl xlrd.
  • For HTML: Ensure you have libraries like lxml, html5lib, and BeautifulSoup4.
  • For SQL: Use appropriate libraries for your database (e.g., sqlalchemy for SQLite).

Practical Tips

  1. Always validate the file paths and libraries needed for reading/writing specific formats.
  2. If working with large data, consider columnar formats like Parquet for better performance.
  3. For troubleshooting permissions or library-specific errors, refer to documentation or community resources.