I'm trying to do some data analysis with Python and pandas on a power consumption dataset.
However, when I plot the data I get a straight line from 5-1-2007 to 13-1-2007,
even though there are no missing values in my dataset. This is strange, as I made sure that my dataset is clean.
Has anyone had a similar issue, or can anyone explain this behavior?
Thank you.
Edit: here is what the data looks like in that range.
Edit 2: here is a link to the original dataset (before cleaning), in case it helps: https://archive.ics.uci.edu/ml/machine-learning-databases/00235/
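What does the data between 2007-01-01 and 2007-01-15 look like? (Use df[(df['Date_Time'] >= '2007-01-01') & (df['Date_Time'] <= '2007-01-15')].)
If no data is missing, the dataset may have been manipulated and the missing data points interpolated (see Interpolation).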
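Both possibilities can be checked roughly as sketched below. This is only an illustration: the helper names `find_time_gaps` and `longest_linear_run`, the one-minute sampling interval, and the tiny demo frame are assumptions, not something taken from the question. The first helper lists places where consecutive timestamps are further apart than expected (rows missing entirely, which `isna()` would not reveal). The second reports the longest stretch of values with a constant slope, which is what linear interpolation over a gap would leave behind.

```python
import pandas as pd

def find_time_gaps(df, time_col="Date_Time", expected="1min"):
    """List places where consecutive timestamps are more than `expected` apart."""
    times = pd.to_datetime(df[time_col]).sort_values()
    step = times.diff()                        # first entry is NaT, compares False
    too_long = step > pd.Timedelta(expected)
    return pd.DataFrame({
        "gap_start": times[too_long] - step[too_long],
        "gap_end": times[too_long],
        "gap_length": step[too_long],
    })

def longest_linear_run(values, tol=1e-9):
    """Longest run of (near-)zero second differences, i.e. constant slope."""
    second_diff = pd.Series(values, dtype="float64").diff().diff().abs()
    linear = second_diff < tol                 # NaN compares False
    run_id = linear.astype(int).diff().ne(0).cumsum()
    return int(linear.groupby(run_id).sum().max())

# Hypothetical demo data: four rows with a jump from 5 to 13 January 2007.
demo = pd.DataFrame({
    "Date_Time": ["2007-01-05 00:00", "2007-01-05 00:01",
                  "2007-01-13 00:00", "2007-01-13 00:01"],
    "value": [1.0, 1.2, 0.9, 1.1],
})
gaps = find_time_gaps(demo)
print(gaps)
print(longest_linear_run([1, 2, 3, 4, 5, 7, 3]))
```

A non-empty result from the first helper means rows are absent even though no cell is NaN. A very long run from the second is only a hint of interpolation, since a genuinely constant signal would look the same.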
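The fact is that a datetime x axis keeps running over a period even when there are no y values for it, so the line is simply drawn straight across that stretch. This is especially noticeable in financial data over weekends and holidays, or wherever there are gaps. The problem is described in a linked post.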
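Although you say the data is all there, try this code anyway; perhaps something was overlooked. To avoid drawing across stretches where there is no y data, plot against the row number and relabel the ticks with ticker.FuncFormatter(format_date). Below is the code, run on a data file in which I deliberately created gaps: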
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

# Load the data with the DATE column parsed as a DatetimeIndex.
df = pd.read_csv('custom.csv',
                 index_col='DATE',
                 parse_dates=True)

z = df.iloc[:, 3].values   # values to plot (fourth data column)
date = df.index.date       # calendar date of each row

fig, axes = plt.subplots(ncols=2)

# Left: plot against real dates; a gap is bridged by a straight line.
ax = axes[0]
ax.plot(date, z)
ax.set_title("Default")

# Right: plot against the row number, so missing dates take up no space.
N = len(z)
ind = np.arange(N)

def format_date(x, pos=None):
    # Map a tick position (row number) back to its date label.
    thisind = np.clip(int(x + 0.5), 0, N - 1)
    return date[thisind].strftime('%Y-%m-%d')

ax = axes[1]
ax.plot(ind, z)
ax.xaxis.set_major_formatter(ticker.FuncFormatter(format_date))
ax.set_title("Without empty values")

fig.autofmt_xdate()
plt.show()
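A different way to handle the same situation, if you would rather keep a real date axis, is to put the data on a regular time grid so that the missing timestamps become NaN; matplotlib does not draw a line segment through NaN values, so the gap shows up as a break. The sketch below uses made-up minute-level data with a hole from 5 to 13 January 2007. The series, the one-minute frequency, and the output file name are all assumptions for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical minute-level series covering 1 to 15 January 2007.
idx = pd.date_range("2007-01-01", "2007-01-15", freq="min")
series = pd.Series(np.sin(np.arange(len(idx)) / 500.0), index=idx)

# Drop the rows from 5 January up to (not including) 13 January.
series = series[(series.index < "2007-01-05") | (series.index >= "2007-01-13")]

# Reindex onto a regular one-minute grid: absent timestamps become NaN.
regular = series.asfreq("min")

fig, axes = plt.subplots(ncols=2, figsize=(10, 4))
axes[0].plot(series.index, series.values)
axes[0].set_title("Rows missing: line bridges the gap")
axes[1].plot(regular.index, regular.values)
axes[1].set_title("Reindexed: NaN breaks the line")
fig.autofmt_xdate()
fig.savefig("gap_demo.png")
```

Compared with plotting against the row number, this keeps the x axis proportional to time and makes the gap itself visible, at the cost of leaving empty space in the plot.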