File Formats for Long-Term Access

Open file formats help ensure access to your data over the long term.  

    Guidelines for selecting file formats

    Open file formats typically have the following characteristics:

    • Non-proprietary
    • Open, documented standard
    • Common usage by the research community
    • Standard representation (e.g. ASCII, Unicode)
    • Unencrypted
    • Uncompressed

    It is also important to document, in a README file, the software needed to access and use the data.
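
    As a minimal sketch of the README advice above, the Python snippet below renders a plain-text README that records which software opens each file. The function name build_readme, the tuple layout, and the sample entry are all hypothetical, chosen only for illustration; adapt the fields to whatever your repository or funder requires.

```python
def build_readme(dataset_title, files):
    """Render a plain-text README listing the software needed for each file.

    `files` is a list of (filename, file_format, software) tuples.
    This layout is an illustrative assumption, not a required standard.
    """
    lines = [f"README for: {dataset_title}", "", "Files and required software:"]
    for filename, file_format, software in files:
        lines.append(f"- {filename} ({file_format}): open with {software}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example entry.
    print(build_readme(
        "Example survey",
        [("responses.csv", "CSV", "any text editor or spreadsheet program")],
    ))
```

    Writing the README itself as plain text keeps the documentation in an open format as well.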

    Examples of preferred file formats

    The following list (source: UK Data Service) is not exhaustive:

    • Text: PDF/A, RTF, TXT, XML
    • Audio: FLAC
    • Image: TIFF
    • Spreadsheet: CSV, TAB
    • Video: MP4, OGV, OGG, MJ2
    • Geospatial: SHP, SHX, DBF, PRJ, TIF, TFW, DWG, GML
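
    One way to apply the list above is to check file names against it before deposit. The Python sketch below is illustrative only: the PREFERRED mapping simply transcribes the extensions listed above, and the function names classify and find_non_preferred are hypothetical. Note that an extension alone cannot confirm a format; for example, a .pdf file is not necessarily PDF/A.

```python
from __future__ import annotations

from pathlib import Path

# Extensions transcribed from the list above (assumption: lower-case,
# with .tif and .tiff both accepted for TIFF images).
PREFERRED = {
    "text": {".pdf", ".rtf", ".txt", ".xml"},  # .pdf may or may not be PDF/A
    "audio": {".flac"},
    "image": {".tif", ".tiff"},
    "spreadsheet": {".csv", ".tab"},
    "video": {".mp4", ".ogv", ".ogg", ".mj2"},
    "geospatial": {".shp", ".shx", ".dbf", ".prj", ".tif", ".tfw", ".dwg", ".gml"},
}


def classify(filename: str) -> list[str]:
    """Return the categories whose listed extensions match the file name."""
    suffix = Path(filename).suffix.lower()
    return sorted(cat for cat, exts in PREFERRED.items() if suffix in exts)


def find_non_preferred(filenames: list[str]) -> list[str]:
    """Return the file names whose extension is not in the list."""
    return [name for name in filenames if not classify(name)]


if __name__ == "__main__":
    for name in ["survey.csv", "map.TIF", "notes.docx"]:
        print(name, classify(name) or "not in the preferred list")
```

    A file flagged by find_non_preferred is not necessarily unsuitable, since the list is not exhaustive; the check is only a prompt to review the format.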

    More information:

    Some resources for identifying preferred long-term preservation file formats include: