import pandas as pd
import numpy as np
from datetime import datetime, time
# history file and batch size for processing.
historyFilePath = 'EURUSD.SAMPLE.csv'
batch_size = 5000
# function for date parsing
dateparse = lambda x: pd.datetime.strptime(x, '%Y-%m-%d %H:%M:%S.%f')
# load data into a pandas iterator with all the chunks
ratesFromCSVChunks = pd.read_csv(historyFilePath, index_col=0, engine='python', parse_dates=True,
date_parser=dateparse, header=None,
names=["datetime", "1_Current", "2_BidPx", "3_BidSz", "4_AskPx", "5_AskSz"],
iterator=True,
chunksize=batch_size)
# concatenate chunks to get the final array
ratesFromCSV = pd.concat([chunk for chunk in ratesFromCSVChunks])
# save final csv file
df.to_csv('EURUSD_processed.csv', date_format='%Y-%m-%d %H:%M:%S.%f',
columns=['1_Current', '2_BidPx', '3_BidSz', '4_AskPx', '5_AskSz'], header=False, float_format='%.5f')
I am reading a CSV file containing forex data in the format
2014-08-17 17:00:01.000000,1.33910,1.33910,1.00000,1.33930,1.00000
2014-08-17 17:00:01.000000,1.33910,1.33910,1.00000,1.33950,1.00000
2014-08-17 17:00:02.000000,1.33910,1.33910,1.00000,1.33930,1.00000
2014-08-17 17:00:02.000000,1.33900,1.33900,1.00000,1.33940,1.00000
2014-08-17 17:00:04.000000,1.33910,1.33910,1.00000,1.33950,1.00000
2014-08-17 17:00:05.000000,1.33930,1.33930,1.00000,1.33950,1.00000
2014-08-17 17:00:06.000000,1.33920,1.33920,1.00000,1.33960,1.00000
2014-08-17 17:00:06.000000,1.33910,1.33910,1.00000,1.33950,1.00000
2014-08-17 17:00:08.000000,1.33900,1.33900,1.00000,1.33942,1.00000
2014-08-17 17:00:16.000000,1.33900,1.33900,1.00000,1.33940,1.00000
How do you convert from Datatime in the CSV file or pandas dataframe being read to EPOCH time in MILLISECONDS from MIDNIGHT ( UTC or localized ) by the time it is being saved. Each file Starts at Midnight every day . The only thing being changed is the format of datetime to miilliseconds from midnight every day( UTC or localized) . The format i am looking for is:
43264234, 1.33910,1.33910,1.00000,1.33930,1.00000
43264739, 1.33910,1.33910,1.00000,1.33950,1.00000
43265282, 1.33910,1.33910,1.00000,1.33930,1.00000
43265789, 1.33900,1.33900,1.00000,1.33940,1.00000
43266318, 1.33910,1.33910,1.00000,1.33950,1.00000
43266846, 1.33930,1.33930,1.00000,1.33950,1.00000
43267353, 1.33920,1.33920,1.00000,1.33960,1.00000
43267872, 1.33910,1.33910,1.00000,1.33950,1.00000
43268387, 1.33900,1.33900,1.00000,1.33942,1.00000
Any help is well appreciated ( short & precise in Python 3.5 or Python 3.4 and above with Pandas 0.18.1 and numpy 1.11 )