Mapping CIK, company name and ticker

Over the years I have encountered the issue of mapping between CIK, company name and stock ticker.

Below is a list of useful resources to help with this transcoding:

https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=AAPL&count=100&output=xml

  • The response is XML, so the CIK and company name still have to be parsed out of it

https://www.sec.gov/Archives/edgar/cik-lookup-data.txt

https://www.sec.gov/include/ticker.txt
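As a rough sketch of how the two flat files above could be turned into lookup tables: the parsers below assume that ticker.txt holds one lowercase ticker and its CIK per line separated by whitespace, and that cik-lookup-data.txt holds lines of the form NAME:CIK: with a zero-padded CIK. Both layouts are assumptions worth checking against the live files, and the function names are my own. Downloading is left out on purpose; the functions only take text that has already been fetched.

```python
def parse_ticker_file(text):
    """Parse 'ticker<whitespace>cik' lines into a {TICKER: cik} dict.

    Assumed layout (not guaranteed by the SEC): one pair per line.
    """
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip blank or malformed lines
        ticker, cik = parts
        mapping[ticker.upper()] = int(cik)
    return mapping


def parse_cik_lookup(text):
    """Parse 'COMPANY NAME:0000320193:' lines into a {cik: [names]} dict.

    Assumed layout (not guaranteed by the SEC). A CIK can map to several
    names, so the values are lists.
    """
    names = {}
    for line in text.splitlines():
        line = line.strip().rstrip(':')
        # Split on the last colon so names containing ':' stay intact.
        name, sep, cik = line.rpartition(':')
        if not sep or not cik.isdigit():
            continue
        names.setdefault(int(cik), []).append(name)
    return names


# Small hand-written samples in the assumed layouts.
ticker_to_cik = parse_ticker_file("aapl\t320193\nmsft\t789019\n")
cik_to_names = parse_cik_lookup("APPLE INC:0000320193:\n")

# Ticker -> CIK -> company name(s)
apple_names = cik_to_names[ticker_to_cik["AAPL"]]
```

Keying everything on the integer CIK sidesteps the zero-padding difference between the two files.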

A low memory, dataframe native accessor

I was recently running into memory issues (related to not closing a multiprocessing.Pool) and came up with the following piece of code while debugging:

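import pickle


class DynamoDataFrame:
    """Keeps a pickled dataframe on disk and loads it only when indexed."""

    def render_fhandle(self, mode='wb'):
        return open(file=self.full_fpath, mode=mode)

    def __init__(self, full_fpath=None, df=None):
        self.full_fpath = full_fpath

        # If a dataframe is supplied, write it to disk right away.
        if df is not None:
            with self.render_fhandle(mode='wb') as fhandle:
                pickle.dump(obj=df, file=fhandle)

    def load_pickle(self):
        # Close the handle after reading so repeated lookups do not leak it.
        with self.render_fhandle(mode='rb') as fhandle:
            return pickle.load(fhandle)

    def __getitem__(self, item):
        # Load, index, and let the full object go out of scope.
        return self.load_pickle()[item]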

Although this was unrelated to my pool issue, it is a useful piece of code.

What’s nice is that __getitem__ behaves like a series or dataframe accessor, while the full dataframe is held in memory only for the duration of the lookup! I tend to call these things “dynamo” for dynamic.
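To make the behaviour concrete, here is a minimal usage sketch. It uses a trimmed-down, hypothetical LazyPickle class rather than the exact code above, and a plain dict of lists standing in for a dataframe so that it runs with only the standard library. Any picklable object that supports indexing should behave the same way.

```python
import os
import pickle
import tempfile


class LazyPickle:
    """Hypothetical stand-in for the class above: only the path stays in memory."""

    def __init__(self, full_fpath, obj=None):
        self.full_fpath = full_fpath
        if obj is not None:
            with open(full_fpath, 'wb') as fhandle:
                pickle.dump(obj, fhandle)

    def __getitem__(self, item):
        # Unpickle, index, and drop the full object as soon as we return.
        with open(self.full_fpath, 'rb') as fhandle:
            return pickle.load(fhandle)[item]


# A dict of lists plays the role of the dataframe in this sketch.
table = {'ticker': ['AAPL', 'MSFT'], 'cik': [320193, 789019]}

with tempfile.TemporaryDirectory() as tmpdir:
    lazy = LazyPickle(os.path.join(tmpdir, 'table.pkl'), obj=table)
    del table  # the in-memory copy is no longer needed

    ciks = lazy['cik']  # the whole pickle is read, then only this column is kept
    stored_attributes = list(vars(lazy))  # what the wrapper itself holds on to
```

The trade-off is that every lookup re-reads and unpickles the whole file, so this saves memory at the cost of repeated disk reads.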

https://gist.github.com/jasonmellone/511d80886f85331326d8c3bb74b32a41