What are the most common bugs in ARC input files and how to fix them?

Written by EST Admin
Updated 2 months ago

While we work on making our research apps as robust and error-tolerant as possible, some of you may still find it challenging to create input files that comply with the requirements of the abnormal return calculator.

On this page, we list the most common errors users make, provide illustrative screenshots, and explain how to fix the files.

What your files should look like

Before looking at the most common bugs, let's first look at what your files should look like. The screenshots below show the sample data opened in a text editor, such as Microsoft Windows Notepad:

[Screenshots: 02_FirmData, 03_MarketData]

If your input files look different when you open them in a text editor, the abnormal return calculators will likely not perform the analysis and will display an "application error" instead.

The most common bugs we see

#1 Bug: Commas used instead of semicolons as separators

Example of the bug:

Why this issue arises: 

Excel and other spreadsheet programs typically take their separator and date conventions from your system's regional settings.

When you save your file in CSV format, the spreadsheet software may therefore use commas as the column separator, depending on your regional settings (e.g., with English (US) settings).

EST ARC requires semicolons because commas often appear in company names and are used as the decimal separator in some regions.

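The fix: Use your text editor's find-and-replace function to replace all commas with semicolons. Check first that no commas appear inside company names or numbers, since those would be replaced as well.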
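If you prefer a script to a text editor, the conversion could be sketched in Python as follows. This is only a minimal sketch: the function name commas_to_semicolons and the sample rows are made up for illustration, and it assumes that any company name containing a comma is wrapped in double quotes, as spreadsheet software usually does when saving CSV.

```python
import csv
import io


def commas_to_semicolons(text: str) -> str:
    """Rewrite comma-separated text so that semicolons separate the columns.

    Assumption: fields that contain a comma (e.g. company names) are
    enclosed in double quotes, so the csv module can tell them apart
    from separator commas.
    """
    reader = csv.reader(io.StringIO(text), delimiter=",")
    out = io.StringIO()
    writer = csv.writer(out, delimiter=";", lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample rows, not taken from the EST sample data.
sample = 'Adidas AG,2015-01-02,55.1\n"Acme, Inc.",2015-01-02,10.5\n'
print(commas_to_semicolons(sample))
```

Unlike a plain find and replace, this approach leaves the comma inside a quoted company name untouched.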

#2 Bug: Random commas or semicolons at the end of your input file

Example of the bug:

Why this issue arises: 

When editing data in Excel or other spreadsheet software, people often copy or move whole columns and delete others.

When you then save your file in CSV format, the spreadsheet software often keeps those deleted or empty columns in the file.

EST ARC then finds more than the required 3 columns in the input files and cannot read your inputs correctly.

The fix: Use your text editor's find-and-replace function to replace every run of two or more adjacent commas or semicolons with nothing. This deletes all unneeded separators. Most editors have no separate find-and-delete function, so replacing with an empty string is the way to do it.
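The same cleanup can be sketched in Python. This is a narrower variant of the fix, not an official tool: it only strips separators at the end of each line and drops rows that consisted of separators alone. The function names and the sample row are hypothetical.

```python
import re

# One or more commas/semicolons (plus optional whitespace) at the end of a line.
TRAILING_SEPARATORS = re.compile(r"[;,]+\s*$")


def strip_trailing_separators(line: str) -> str:
    """Remove leftover separators at the end of a single line."""
    return TRAILING_SEPARATORS.sub("", line.rstrip("\r\n"))


def clean_text(text: str) -> str:
    """Clean every line and drop lines that become empty."""
    cleaned = (strip_trailing_separators(line) for line in text.splitlines())
    return "".join(line + "\n" for line in cleaned if line)


# Hypothetical sample: one data row with empty columns, one fully empty row.
print(clean_text("Adidas AG;02.01.2015;55.1;;;\n;;;\n"))
```

Stripping only at the end of a line avoids touching a comma that serves as a decimal separator inside a value.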

#3 Bug: Wrong date formats

Example of the bug:

Why this issue arises: 

Wrong date formats are a common error for two main reasons. First, date conventions vary across countries, and spreadsheet software saves dates according to the local scheme. Second, the spreadsheet program sometimes fails to recognize a date correctly; when the file is then saved as CSV, the date is recorded as a number (see the third example in the picture on the left).

EST ARC allows two date formats: DD.MM.YYYY and YYYY-MM-DD.

The fix: If your local date scheme is the problem, you have two options. Either change the date scheme in your spreadsheet program, or build the date as a text string rather than as a date-formatted cell. In Excel, you can use the CONCAT() function for this.
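As an alternative outside the spreadsheet, dates could be converted to the accepted YYYY-MM-DD format with a short Python sketch like the one below. The function names are hypothetical, and you have to state the source format yourself because a string such as 01/02/2015 is ambiguous. The serial-number conversion assumes Excel's default 1900 date system and a serial above 59.

```python
from datetime import date, datetime, timedelta

# Assumption: Excel's 1900 date system. With this epoch, serial numbers
# above 59 map to the correct calendar date.
EXCEL_EPOCH = date(1899, 12, 30)


def to_iso(value: str, source_format: str) -> str:
    """Convert a date string in a known format to YYYY-MM-DD."""
    return datetime.strptime(value.strip(), source_format).strftime("%Y-%m-%d")


def serial_to_iso(serial: int) -> str:
    """Convert an Excel date serial number to YYYY-MM-DD."""
    return (EXCEL_EPOCH + timedelta(days=serial)).isoformat()


print(to_iso("01/31/2015", "%m/%d/%Y"))  # US-style month/day/year input
print(serial_to_iso(42005))              # a date that was saved as a number
```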

#4 Bug: Duplicate data in firm or market files

Example of the bug:

Why this issue arises: 

This issue may arise when you combine multiple time series for the same index or firm, which may be necessary if your sample contains multiple events for the same firm or index.

EST does not allow duplicate data since it cannot decide which of the entries is the correct one.

The fix: To fix this bug, you will need your spreadsheet software. In MS Excel, for example, you can select the data (company/index name and date) in your input files and use the "Remove Duplicates" function. Please note that a single company or index cannot have more than one closing price on a given day.
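The duplicate check can also be sketched in Python. The sketch below assumes that each row has already been split into a name, a date, and a closing price, in that order; this column order, the function name deduplicate, and the sample values are assumptions made for illustration. It keeps the first entry per name and date and separately reports later entries whose price differs, since those need a manual decision.

```python
def deduplicate(rows):
    """Keep the first row per (name, date) and collect conflicting rows.

    rows: iterable of (name, date, price) tuples of strings.
    Returns (kept, conflicts), where conflicts holds later rows whose
    price differs from the first row with the same name and date.
    """
    first_price = {}
    kept, conflicts = [], []
    for name, day, price in rows:
        key = (name, day)
        if key not in first_price:
            first_price[key] = price
            kept.append((name, day, price))
        elif first_price[key] != price:
            conflicts.append((name, day, price))
    return kept, conflicts


# Hypothetical sample: one exact duplicate and one conflicting price.
rows = [
    ("Adidas AG", "02.01.2015", "55.1"),
    ("Adidas AG", "02.01.2015", "55.1"),
    ("Adidas AG", "05.01.2015", "56.0"),
    ("Adidas AG", "05.01.2015", "57.0"),
]
kept, conflicts = deduplicate(rows)
print(kept)
print(conflicts)
```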
