def main():
    """Benchmark ETL + single-GPU XGBoost training over quarterly perf files.

    Stage 1 (ETL): loads ``count_quarter_processing`` consecutive quarterly
    Fannie-Mae performance files via ``run_gpu_workflow`` and times the loop.
    Stage 2 (ML): converts the first Arrow table back to a GPU DataFrame,
    builds an XGBoost DMatrix on the ``delinquency_12`` target and trains a
    gpu_hist booster, timing the train step.

    Relies on module-level names defined elsewhere in this file:
    ``perf_data_path``, ``count_quarter_processing``, ``run_gpu_workflow``,
    ``DataFrame`` (cudf) and ``xgb`` (xgboost).
    """
    gpu_dfs = []
    perf_format_path = perf_data_path + "/Performance_%sQ%s.txt"

    # ETL stage ##############################################################
    time_ETL = time.time()
    for quarter in range(1, count_quarter_processing + 1):
        # Quarters are numbered 1..N across years: 1-4 -> year 2000 Q1-Q4,
        # 5-8 -> year 2001 Q1-Q4, ...  The previous mapping
        # (`2000 + quarter // 4`, `quarter % 4`) produced a nonexistent
        # "Q0" file every fourth quarter and never read Q4; this mapping
        # is identical for quarters 1-3 and correct from quarter 4 on.
        year = 2000 + (quarter - 1) // 4
        q = (quarter - 1) % 4 + 1
        file = perf_format_path % (str(year), str(q))
        gpu_dfs.append(
            run_gpu_workflow(year=year, quarter=q, perf_file=file))
    time_ETL_end = time.time()
    print("ETL time: ", time_ETL_end - time_ETL)
    ##########################################################################

    # Hyperparameters for the GPU booster; 'nround' is reused below as the
    # number of boosting rounds.
    dxgb_gpu_params = {
        'nround': 100,
        'max_depth': 8,
        'max_leaves': 2**8,
        'alpha': 0.9,
        'eta': 0.1,
        'gamma': 0.1,
        'learning_rate': 0.1,
        'subsample': 1,
        'reg_lambda': 1,
        'scale_pos_weight': 2,
        'min_child_weight': 30,
        'tree_method': 'gpu_hist',
        'n_gpus': 1,
        # 'distributed_dask': True,
        'loss': 'ls',
        'objective': 'gpu:reg:linear',
        'max_features': 'auto',
        'criterion': 'friedman_mse',
        'grow_policy': 'lossguide',
        'verbose': True
    }

    # ML stage: Arrow tables -> cudf DataFrames; train on the first quarter
    # only (the remaining frames are materialized but unused here).
    gpu_dfs = [DataFrame.from_arrow(gpu_df) for gpu_df in gpu_dfs]
    pd_df = gpu_dfs[0].to_pandas()
    y = pd_df["delinquency_12"]
    x = pd_df.drop(["delinquency_12"], axis=1)
    pd_df = xgb.DMatrix(x, y)
    bst = xgb.train(dxgb_gpu_params, pd_df,
                    num_boost_round=dxgb_gpu_params['nround'])
    time_ML_train_end = time.time()
    print("Machine learning - train: ", time_ML_train_end - time_ETL_end)
def test_datetime_to_arrow(dtype):
    """A datetime column must survive a to_arrow/from_arrow round trip.

    Builds one day of hourly timestamps, casts the column to ``dtype``,
    exports the frame to an Arrow table (without the index) and checks that
    re-importing yields an equal DataFrame.
    """
    base = cudf.datasets.timeseries(
        start="2000-01-01", end="2000-01-02", freq="3600s", dtypes={})
    timestamp = base.reset_index()["timestamp"].reset_index(drop=True)
    gdf = DataFrame({"timestamp": timestamp.astype(dtype)})
    arrow_table = gdf.to_arrow(preserve_index=False)
    assert_eq(gdf, DataFrame.from_arrow(arrow_table))