The European Union Policy-Making (EUPOL) dataset

The EUPOL dataset includes virtually all information contained in PreLex, the European Commission’s online database. PreLex monitors the inter-institutional decision-making process of the EU: it records the formal events and institutional actors involved in policy-making and provides cross-references to documents held in other online databases. Together with its long-term coverage, which reaches back to the mid-1970s, this makes PreLex a useful source for studies of EU politics. The EUPOL dataset provides this information in a standardized, machine-readable format. The latest version of the dataset consists of 33,673 decision-making cases, whose features are described by more than 2,700 variables. For further information, please see the dataset description published in European Union Politics (see reference below) and the relevant blog posts.

Dataset description (please cite when using the dataset)


EUPOL v05 (1975-2014), raw information downloaded 17-18 September 2014
* Note that the PreLex data for 2013 and 2014 are incomplete; see this blog post for more details. *

Previous versions of the dataset can be accessed here.


The scripts that download, extract, and store the information from PreLex were written in Python 2.6.5, using ActiveState’s PythonWin editor. They rely on the following external Python modules: BeautifulSoup, ClientForm, and mechanize.
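The extraction step can be illustrated with a minimal sketch. It is not the EUPOL code: it targets Python 3, swaps in the standard library's html.parser for BeautifulSoup so that it runs without external modules, and omits the download step entirely. The page layout (two-cell table rows holding a label and a value), the sample content, and the names KeyValueTableParser, extract_record, and SAMPLE_PAGE are all assumptions made for illustration, not actual PreLex markup.

```python
from html.parser import HTMLParser


class KeyValueTableParser(HTMLParser):
    """Collect the cell texts of every table row (hypothetical page layout)."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell texts
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr":
            if self._row is not None:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_record(html):
    """Turn label/value rows into one flat record (a dict) per page."""
    parser = KeyValueTableParser()
    parser.feed(html)
    parser.close()
    return {row[0].rstrip(":"): row[1] for row in parser.rows if len(row) == 2}


# Invented sample page; real PreLex pages were structured differently.
SAMPLE_PAGE = """
<table>
  <tr><td>Procedure:</td><td>Consultation</td></tr>
  <tr><td>Example field:</td><td>example   value</td></tr>
</table>
"""

record = extract_record(SAMPLE_PAGE)
print(record)
```

Storing one flat record per decision-making case is what makes it possible to stack all cases into a single rectangular dataset, with one row per case and one column per variable.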


The scripts used to generate the dataset were largely written in 2008, while I was a postdoctoral fellow in the Department of Public Administration at Leiden University. The postdoctoral fellowship was co-funded by the Netherlands Institute of Government.


The table below compares the aggregate number of cases by type of legislative procedure, as coded in the EUPOL v01 and LawLeecher (Kovats 2010) datasets. The match is not perfect, but the comparison indicates no major discrepancies between the two datasets: the absolute differences sum to 171 cases, under 2 percent of the roughly 9,300 cases compared. The differences are most likely due to slightly different coding decisions rather than to differences in the originally extracted information. These preliminary results therefore provide some confidence in the validity of the two independently programmed extraction procedures.

Legislative procedure           EUPOL  LawLeecher  Difference  Absolute difference
Consultation                     6166        6135          31                   31
Agreement                        1170        1147          23                   23
Codecision                       1112        1147         -35                   35
Cooperation                       580         511          69                   69
Assent                            251         242           9                    9
Consultation ECB                   16          17          -1                    1
Consultation CoA                    3           3           0                    0
Social Protocol                     3           2           1                    1
Special Legislative Procedure       2           0           2                    2
Total                            9303        9204          99                  171
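
The difference columns and the totals row can be recomputed from the per-procedure counts. The following is a minimal Python sketch with the counts copied from the table above. The names COUNTS and compare are illustrative and are not part of the EUPOL or LawLeecher scripts.

```python
# Case counts per legislative procedure: (EUPOL v01, LawLeecher),
# copied from the comparison table.
COUNTS = {
    "Consultation": (6166, 6135),
    "Agreement": (1170, 1147),
    "Codecision": (1112, 1147),
    "Cooperation": (580, 511),
    "Assent": (251, 242),
    "Consultation ECB": (16, 17),
    "Consultation CoA": (3, 3),
    "Social Protocol": (3, 2),
    "Special Legislative Procedure": (2, 0),
}


def compare(counts):
    """Return per-procedure rows and column totals.

    Each row is (procedure, eupol, lawleecher, difference, absolute difference),
    where difference = eupol - lawleecher.
    """
    rows = []
    for procedure, (eupol, lawleecher) in counts.items():
        diff = eupol - lawleecher
        rows.append((procedure, eupol, lawleecher, diff, abs(diff)))
    # Sum each numeric column (indices 1 to 4) across all procedures.
    totals = tuple(sum(row[i] for row in rows) for i in range(1, 5))
    return rows, totals


rows, totals = compare(COUNTS)
for row in rows:
    print("%-30s %6d %6d %5d %5d" % row)
print("%-30s %6d %6d %5d %5d" % (("Total",) + totals))
```

The signed differences partly cancel out (99 in total), which is why the table also reports the sum of absolute differences (171) as the more conservative measure of disagreement.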