Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython (watermark-free PDF)
Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython (2nd edition), English, watermark-free PDF. All pages of the PDF have been tested and open in Foxit Reader and PDF-XChange Viewer. This resource was reposted from the internet; in case of infringement, please contact the uploader or CSDN to have it removed.

SECOND EDITION
Python for Data Analysis
Data Wrangling with Pandas, NumPy, and IPython
Wes McKinney
Beijing, Boston, Farnham, Sebastopol, Tokyo
O'REILLY

Python for Data Analysis
by Wes McKinney
Copyright © 2018 William McKinney. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com/safari). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Editor: Marie Beaugureau
Production Editor: Kristen Brown
Copyeditor: Jasmine Kwityn
Proofreader: Rachel Monaghan
Indexer: Lucie Haskins
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest

October 2012: First Edition
October 2017: Second Edition

Revision History for the Second Edition
2017-09-25: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781491957660 for release details.

The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Python for Data Analysis, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.

While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk.
If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-491-95766-0

Table of Contents

Preface

1. Preliminaries
  1.1 What Is This Book About?
    What Kinds of Data?
  1.2 Why Python for Data Analysis?
    Python as Glue
    Solving the "Two-Language" Problem
    Why Not Python?
  1.3 Essential Python Libraries
    NumPy
    pandas
    matplotlib
    IPython and Jupyter
    SciPy
    scikit-learn
    statsmodels
  1.4 Installation and Setup
    Windows
    Apple (OS X, macOS)
    GNU/Linux
    Installing or Updating Python Packages
    Python 2 and Python 3
    Integrated Development Environments (IDEs) and Text Editors
  1.5 Community and Conferences
  1.6 Navigating This Book
    Code Examples
    Data for Examples
    Import Conventions
    Jargon

2. Python Language Basics, IPython, and Jupyter Notebooks
  2.1 The Python Interpreter
  2.2 IPython Basics
    Running the IPython Shell
    Running the Jupyter Notebook
    Tab Completion
    Introspection
    The %run Command
    Executing Code from the Clipboard
    Terminal Keyboard Shortcuts
    About Magic Commands
    Matplotlib Integration
  2.3 Python Language Basics
    Language Semantics
    Scalar Types
    Control Flow

3. Built-in Data Structures, Functions, and Files
  3.1 Data Structures and Sequences
    Tuple
    List
    Built-in Sequence Functions
    dict
    set
    List, Set, and Dict Comprehensions
  3.2 Functions
    Namespaces, Scope, and Local Functions
    Returning Multiple Values
    Functions Are Objects
    Anonymous (Lambda) Functions
    Currying: Partial Argument Application
    Generators
    Errors and Exception Handling
  3.3 Files and the Operating System
    Bytes and Unicode with Files
  3.4 Conclusion

4. NumPy Basics: Arrays and Vectorized Computation
  4.1 The NumPy ndarray: A Multidimensional Array Object
    Creating ndarrays
    Data Types for ndarrays
    Arithmetic with NumPy Arrays
    Basic Indexing and Slicing
    Boolean Indexing
    Fancy Indexing
    Transposing Arrays and Swapping Axes
  4.2 Universal Functions: Fast Element-Wise Array Functions
  4.3 Array-Oriented Programming with Arrays
    Expressing Conditional Logic as Array Operations
    Mathematical and Statistical Methods
    Methods for Boolean Arrays
    Sorting
    Unique and Other Set Logic
  4.4 File Input and Output with Arrays
  4.5 Linear Algebra
  4.6 Pseudorandom Number Generation
  4.7 Example: Random Walks
    Simulating Many Random Walks at Once
  4.8 Conclusion

5. Getting Started with pandas
  5.1 Introduction to pandas Data Structures
    Series
    DataFrame
    Index Objects
  5.2 Essential Functionality
    Reindexing
    Dropping Entries from an Axis
    Indexing, Selection, and Filtering
    Integer Indexes
    Arithmetic and Data Alignment
    Function Application and Mapping
    Sorting and Ranking
    Axis Indexes with Duplicate Labels
  5.3 Summarizing and Computing Descriptive Statistics
    Correlation and Covariance
    Unique Values, Value Counts, and Membership
  5.4 Conclusion

6. Data Loading, Storage, and File Formats
  6.1 Reading and Writing Data in Text Format
    Reading Text Files in Pieces
    Writing Data to Text Format
    Working with Delimited Formats
    JSON Data
    XML and HTML: Web Scraping
  6.2 Binary Data Formats
    Using HDF5 Format
    Reading Microsoft Excel Files
  6.3 Interacting with Web APIs
  6.4 Interacting with Databases
  6.5 Conclusion

7. Data Cleaning and Preparation
  7.1 Handling Missing Data
    Filtering Out Missing Data
    Filling In Missing Data
  7.2 Data Transformation
    Removing Duplicates
    Transforming Data Using a Function or Mapping
    Replacing Values
    Renaming Axis Indexes
    Discretization and Binning
    Detecting and Filtering Outliers
    Permutation and Random Sampling
    Computing Indicator/Dummy Variables
  7.3 String Manipulation
    String Object Methods
    Regular Expressions
    Vectorized String Functions in pandas
  7.4 Conclusion

8. Data Wrangling: Join, Combine, and Reshape
  8.1 Hierarchical Indexing
    Reordering and Sorting Levels
    Summary Statistics by Level
    Indexing with a DataFrame's columns
  8.2 Combining and Merging Datasets
    Database-Style DataFrame Joins
    Merging on Index
    Concatenating Along an Axis
    Combining Data with Overlap
  8.3 Reshaping and Pivoting
    Reshaping with Hierarchical Indexing
    Pivoting Long to Wide Format
    Pivoting Wide to Long Format
  8.4 Conclusion

9. Plotting and Visualization
  9.1 A Brief matplotlib API Primer
    Figures and Subplots
    Colors, Markers, and Line Styles
    Ticks, Labels, and Legends
    Annotations and Drawing on a Subplot
    Saving Plots to File
    matplotlib Configuration
  9.2 Plotting with pandas and seaborn
    Line Plots
    Bar Plots
    Histograms and Density Plots
    Scatter or Point Plots
    Facet Grids and Categorical Data
  9.3 Other Python Visualization Tools
  9.4 Conclusion

10. Data Aggregation and Group Operations
  10.1 GroupBy Mechanics
    Iterating Over Groups
    Selecting a Column or Subset of Columns
    Grouping with Dicts and Series
    Grouping with Functions
    Grouping by Index Levels
  10.2 Data Aggregation
    Column-Wise and Multiple Function Application
    Returning Aggregated Data Without Row Indexes
  10.3 Apply: General split-apply-combine
    Suppressing the Group Keys
    Quantile and Bucket Analysis
    Example: Filling Missing Values with Group-Specific Values
    Example: Random Sampling and Permutation
    Example: Group Weighted Average and Correlation
    Example: Group-Wise Linear Regression
  10.4 Pivot Tables and Cross-Tabulation
    Cross-Tabulations: Crosstab
  10.5 Conclusion

11. Time Series
  11.1 Date and Time Data Types and Tools
    Converting Between String and Datetime
  11.2 Time Series Basics
    Indexing, Selection, Subsetting
    Time Series with Duplicate Indices
  11.3 Date Ranges, Frequencies, and Shifting
    Generating Date Ranges
    Frequencies and Date Offsets
    Shifting (Leading and Lagging) Data
  11.4 Time Zone Handling
    Time Zone Localization and Conversion
    Operations with Time Zone-Aware Timestamp Objects
    Operations Between Different Time Zones
  11.5 Periods and Period Arithmetic
    Period Frequency Conversion
    Quarterly Period Frequencies
    Converting Timestamps to Periods (and Back)
    Creating a PeriodIndex from Arrays
  11.6 Resampling and Frequency Conversion
    Downsampling
    Upsampling and Interpolation
    Resampling with Periods
  11.7 Moving Window Functions
    Exponentially Weighted Functions
    Binary Moving Window Functions
    User-Defined Moving Window Functions
  11.8 Conclusion

12. Advanced pandas
  12.1 Categorical Data
    Background and Motivation
    Categorical Type in pandas
    Computations with Categoricals
    Categorical Methods
  12.2 Advanced GroupBy Use
    Group Transforms and "Unwrapped" GroupBys
    Grouped Time Resampling
  12.3 Techniques for Method Chaining
    The pipe Method
  12.4 Conclusion