API
XLSX.CellDataFormat — Type.Keeps track of formatting information.
XLSX.CellRange — Type.A CellRange represents a rectangular range of cells in a spreadsheet.
CellRange("A1:C4") denotes cells ranging from A1 (upper left corner) to C4 (bottom right corner).
As a convenience, @range_str macro is provided.
cr = XLSX.range"A1:C4"XLSX.CellRef — Type.A CellRef represents a cell location given by row and column identifiers.
CellRef("A6") indicates a cell located at column 1 and row 6.
Example:
cn = XLSX.CellRef("AB1")
println( XLSX.row_number(cn) ) # will print 1
println( XLSX.column_number(cn) ) # will print 28
println( string(cn) ) # will print out AB1As a convenience, @ref_str macro is provided.
cn = XLSX.ref"AB1"
println( XLSX.row_number(cn) ) # will print 1
println( XLSX.column_number(cn) ) # will print 28
println( string(cn) ) # will print out AB1XLSX.CellValue — Type.CellValue is a Julia type of a value read from a Spreadsheet.
XLSX.Relationship — Type.Relationships are defined in ECMA-376-1 Section 9.2. This struct matches the Relationship tag attribute names.
A Relashipship defines relations between the files inside a MSOffice package. Regarding Spreadsheets, there are two kinds of relationships:
* package level: defined in `_rels/.rels`.
* workbook level: defined in `xl/_rels/workbook.xml.rels`.The function parse_relationships!(xf::XLSXFile) is used to parse package and workbook level relationships.
XLSX.SharedStringTable — Type.Shared String Table
XLSX.SheetRowIterator — Type.Iterates over Worksheet cells. See eachrow method docs. Each element is a SheetRow.
Implementations: SheetRowStreamIterator, WorksheetCache.
XLSX.Workbook — Type.Workbook is the result of parsing file xl/workbook.xml.
XLSX.XLSXFile — Type.XLSXFile stores all XML data from an Excel file.
filepath is the filepath of the source file for this XLSXFile. data stored the raw XML data. It maps internal XLSX filenames to XMLDocuments. workbook is the result of parsing xl/workbook.xml.
Base.in — Method.Base.in(ref::CellRef, rng::CellRange) :: BoolChecks wether ref is a cell reference inside a range given by rng.
Base.issubset — Method.Base.issubset(subrng::CellRange, rng::CellRange)Checks wether subrng is a cell range contained in rng.
Base.iterate — Function.SheetRowStreamIterator(ws::Worksheet)Creates a reader for row elements in the Worksheet's XML. Will return a stream reader positioned in the first row element if it exists.
If there's no row element inside sheetData XML tag, it will close all streams and return nothing.
XLSX.add_relationship! — Method.Adds new relationship. Returns new generated rId.
XLSX.add_shared_string! — Method.add_shared_string!(sheet, str_unformatted, [str_formatted]) :: IntAdd string to shared string table. Returns the 0-based index of the shared string in the shared string table.
XLSX.addsheet! — Function.addsheet!(workbook, [name]) :: WorksheetCreate a new worksheet with named name. If name is not provided, a unique name is created.
XLSX.column_bounds — Method.column_bounds(sr::SheetRow)
Returns a tuple with the first and last index of the columns for a SheetRow.
XLSX.column_number — Method.column_number(c::CellRef) :: IntReturns the column number of a given cell reference.
XLSX.decode_column_number — Method.decode_column_number(column_name::AbstractString) :: IntConverts column name to a column number.
julia> XLSX.decode_column_number("D")
4See also: encode_column_number.
XLSX.default_cell_format — Method.Returns the default CellDataFormat for a type
XLSX.eachrow — Method.eachrow(sheet)Creates a row iterator for a worksheet.
Example: Query all cells from columns 1 to 4.
left = 1  # 1st column
right = 4 # 4th column
for sheetrow in XLSX.eachrow(sheet)
    for column in left:right
        cell = XLSX.getcell(sheetrow, column)
        # do something with cell
    end
endXLSX.eachtablerow — Method.eachtablerow(sheet, [columns]; [first_row], [column_labels], [header], [stop_in_empty_row], [stop_in_row_function])Constructs an iterator of table rows. Each element of the iterator is of type TableRow.
header is a boolean indicating wether the first row of the table is a table header.
If header == false and no names were supplied, column names will be generated following the column names found in the Excel file. Also, the column range will be inferred by the non-empty contiguous cells in the first row of the table.
The user can replace column names by assigning the optional names input variable with a Vector{Symbol}.
stop_in_empty_row is a boolean indicating wether an empty row marks the end of the table. If stop_in_empty_row=false, the iterator will continue to fetch rows until there's no more rows in the Worksheet. The default behavior is stop_in_empty_row=true. Empty rows may be returned by the iterator when stop_in_empty_row=false.
stop_in_row_function is a Function that receives a TableRow and returns a Bool indicating if the end of the table was reached.
Example for stop_in_row_function:
function stop_function(r)
    v = r[:col_label]
    return !ismissing(v) && v == "unwanted value"
endExample code:
for r in XLSX.eachtablerow(sheet)
    # r is a `TableRow`. Values are read using column labels or numbers.
    rn = XLSX.row_number(r) # `TableRow` row number.
    v1 = r[1] # will read value at table column 1.
    v2 = r[:COL_LABEL2] # will read value at column labeled `:COL_LABEL2`.
endSee also gettable.
XLSX.encode_column_number — Method.encode_column_number(column_number::Int) :: StringConverts column number to a column name.
Example
julia> XLSX.encode_column_number(4)
"D"See also: decode_column_number.
XLSX.excel_value_to_date — Method.Converts Excel number to Date.
See also: isdate1904 function.
XLSX.excel_value_to_datetime — Method.Converts Excel number to DateTime.
The decimal part represents the Time (see _time function). The integer part represents the Date.
See also: isdate1904 function.
XLSX.excel_value_to_time — Method.Converts Excel number to Time. x must be between 0 and 1.
To represent Time, Excel uses the decimal part of a floating point number. 1 equals one day.
XLSX.filenames — Method.Lists internal files from the XLSX package.
XLSX.get_dimension — Method.Retuns the dimension of this worksheet as a CellRange.
XLSX.get_shared_string_index — Method.Checks if string is inside shared string table. Returns nothing if it's not in the shared string table. Returns the index of the string in the shared string table. The index is 0-based.
XLSX.getcell — Method.getcell(sheet, ref)Returns an AbstractCell that represents a cell in the spreadsheet.
Example:
julia> xf = XLSX.readxlsx("myfile.xlsx")
julia> sheet = xf["mysheet"]
julia> cell = XLSX.getcell(sheet, "A1")XLSX.getcellrange — Method.getcellrange(sheet, rng)Returns a matrix with cells as Array{AbstractCell, 2}. rng must be a valid cell range, as in "A1:B2".
XLSX.getdata — Method.getdata(sheet, ref)Returns a escalar or a matrix with values from a spreadsheet. ref can be a cell reference or a range.
Indexing in a Worksheet will dispatch to getdata method.
Example
julia> f = XLSX.readxlsx("myfile.xlsx")
julia> sheet = f["mysheet"]
julia> v = sheet["A1:B4"]See also readdata.
XLSX.getdata — Method.getdata(ws::Worksheet, cell::Cell) :: CellValueReturns a Julia representation of a given cell value. The result data type is chosen based on the value of the cell as well as its style.
For example, date is stored as integers inside the spreadsheet, and the style is the information that is taken into account to chose Date as the result type.
For numbers, if the style implies that the number is visualized with decimals, the method will return a float, even if the underlying number is stored as an integer inside the spreadsheet XML.
If cell has empty value or empty String, this function will return missing.
XLSX.gettable — Method.gettable(sheet, [columns]; [first_row], [column_labels], [header], [infer_eltypes], [stop_in_empty_row], [stop_in_row_function]) -> data, column_labelsReturns tabular data from a spreadsheet as a tuple (data, column_labels). data is a vector of columns. column_labels is a vector of symbols. Use this function to create a DataFrame from package DataFrames.jl.
Use columns argument to specify which columns to get. For example, columns="B:D" will select columns B, C and D. If columns is not given, the algorithm will find the first sequence of consecutive non-empty cells.
Use first_row to indicate the first row from the table. first_row=5 will look for a table starting at sheet row 5. If first_row is not given, the algorithm will look for the first non-empty row in the spreadsheet.
header is a Bool indicating if the first row is a header. If header=true and column_labels is not specified, the column labels for the table will be read from the first row of the table. If header=false and column_labels is not specified, the algorithm will generate column labels. The default value is header=true.
Use column_labels as a vector of symbols to specify names for the header of the table.
Use infer_eltypes=true to get data as a Vector{Any} of typed vectors. The default value is infer_eltypes=false.
stop_in_empty_row is a boolean indicating wether an empty row marks the end of the table. If stop_in_empty_row=false, the TableRowIterator will continue to fetch rows until there's no more rows in the Worksheet. The default behavior is stop_in_empty_row=true.
stop_in_row_function is a Function that receives a TableRow and returns a Bool indicating if the end of the table was reached.
Example for stop_in_row_function:
function stop_function(r)
    v = r[:col_label]
    return !ismissing(v) && v == "unwanted value"
endRows where all column values are equal to missing are dropped.
Example code for gettable:
julia> using DataFrames, XLSX
julia> df = XLSX.openxlsx("myfile.xlsx") do xf
                DataFrame(XLSX.gettable(xf["mysheet"])...)
            endSee also: readtable.
XLSX.has_sst — Method.has_sst(workbook::Workbook)Checks wether this workbook has a Shared String Table.
XLSX.internal_xml_file_isread — Method.Returns true if the file data was read into xl.data.
XLSX.is_cache_enabled — Method.Indicates wether worksheet cache will be fed while reading worksheet cells.
XLSX.is_end_of_sheet_data — Method.Detects a closing sheetData element
XLSX.is_valid_cellname — Method.is_valid_cellname(n::AbstractString) :: BoolChecks wether n is a valid name for a cell.
Cell names are bounded by A1 : XFD1048576.
XLSX.is_writable — Method.is_writable(xl::XLSXFile)Indicates wether this XLSX file can be edited. This controls if assignment to worksheet cells is allowed. Writable XLSXFile instances are opened with XLSX.open_xlsx_template method.
XLSX.isdate1904 — Method.isdate1904(wb) :: BoolReturns true if workbook follows date1904 convention.
XLSX.open_empty_template — Function.open_empty_template(sheetname::AbstractString="") :: XLSXFileReturns an empty, writable XLSXFile with 1 worksheet.
sheetname is the name of the worksheet, defaults to Sheet1.
XLSX.open_internal_file_stream — Method.Open a file for streaming.
XLSX.open_xlsx_template — Method.open_xlsx_template(filepath::AbstractString) :: XLSXFileOpen an Excel file as template for editing and saving to another file with XLSX.writexlsx.
The returned XLSXFile instance is in closed state.
XLSX.openxlsx — Method.openxlsx(f::Function, filepath::AbstractString; mode::AbstractString="r", enable_cache::Bool=true)Open XLSX file for reading and/or writing. It returns an opened XLSXFile that will be automatically closed after applying f to the file.
Do syntax
This function should be used with do syntax, like in:
XLSX.openxlsx("myfile.xlsx") do xf
    # read data from `xf`
endFilemodes
The mode argument controls how the file is opened. The following modes are allowed:
- r: read mode. The existing data in- filepathwill be accessible for reading. This is the default mode.
- w: write mode. Opens an empty file that will be written to- filepath.
- rw: edit mode. Opens- filepathfor editing. The file will be saved to disk when the function ends.
Arguments
- filepathis the complete path to the file.
- modeis the file mode, as explained in the last section.
- enable_cache:
If enable_cache=true, all read worksheet cells will be cached. If you read a worksheet cell twice it will use the cached value instead of reading from disk in the second time.
If enable_cache=false, worksheet cells will always be read from disk. This is useful when you want to read a spreadsheet that doesn't fit into memory.
The default value is enable_cache=true.
Examples
Read from file
The following example shows how you would read worksheet cells, one row at a time, where myfile.xlsx is a spreadsheet that doesn't fit into memory.
julia> XLSX.openxlsx("myfile.xlsx", enable_cache=false) do xf
          for r in XLSX.eachrow(xf["mysheet"])
              # read something from row `r`
          end
       endWrite a new file
XLSX.openxlsx("new.xlsx", mode="w") do xf
    sheet = xf[1]
    sheet[1, :] = [1, Date(2018, 1, 1), "test"]
endEdit an existing file
XLSX.openxlsx("edit.xlsx", mode="rw") do xf
    sheet = xf[1]
    sheet[2, :] = [2, Date(2019, 1, 1), "add new line"]
endSee also readxlsx method.
XLSX.parse_file_mode — Method.Parses filemode string to the tuple (read, write). See openxlsx.
XLSX.parse_relationships! — Method.Parses package level relationships defined in _rels/.rels. Prases workbook level relationships defined in xl/_rels/workbook.xml.rels.
XLSX.parse_workbook! — Method.parse_workbook!(xf::XLSXFile)
Updates xf.workbook from xf.data["xl/workbook.xml"]
XLSX.readtable — Method.readtable(filepath, sheet, [columns]; [first_row], [column_labels], [header], [infer_eltypes], [stop_in_empty_row], [stop_in_row_function]) -> data, column_labelsReturns tabular data from a spreadsheet as a tuple (data, column_labels). data is a vector of columns. column_labels is a vector of symbols. Use this function to create a DataFrame from package DataFrames.jl.
Use columns argument to specify which columns to get. For example, columns="B:D" will select columns B, C and D. If columns is not given, the algorithm will find the first sequence of consecutive non-empty cells.
Use first_row to indicate the first row from the table. first_row=5 will look for a table starting at sheet row 5. If first_row is not given, the algorithm will look for the first non-empty row in the spreadsheet.
header is a Bool indicating if the first row is a header. If header=true and column_labels is not specified, the column labels for the table will be read from the first row of the table. If header=false and column_labels is not specified, the algorithm will generate column labels. The default value is header=true.
Use column_labels as a vector of symbols to specify names for the header of the table.
Use infer_eltypes=true to get data as a Vector{Any} of typed vectors. The default value is infer_eltypes=false.
stop_in_empty_row is a boolean indicating wether an empty row marks the end of the table. If stop_in_empty_row=false, the TableRowIterator will continue to fetch rows until there's no more rows in the Worksheet. The default behavior is stop_in_empty_row=true.
stop_in_row_function is a Function that receives a TableRow and returns a Bool indicating if the end of the table was reached.
Example for stop_in_row_function:
function stop_function(r)
    v = r[:col_label]
    return !ismissing(v) && v == "unwanted value"
endRows where all column values are equal to missing are dropped.
Example code for readtable:
julia> using DataFrames, XLSX
julia> df = DataFrame(XLSX.readtable("myfile.xlsx", "mysheet")...)
See also: `gettable`.XLSX.readxlsx — Method.readxlsx(filepath) :: XLSXFileMain function for reading an Excel file. This function will read the whole Excel file into memory and return a closed XLSXFile.
Consider using openxlsx for lazy loading of Excel file contents.
XLSX.relative_cell_position — Method.Returns (row, column) representing a ref position relative to rng.
For example, for a range "B2:D4", we have:
- "C3" relative position is (2, 2) 
- "B2" relative position is (1, 1) 
- "C4" relative position is (3, 2) 
- "D4" relative position is (3, 3) 
XLSX.row_number — Method.row_number(c::CellRef) :: IntReturns the row number of a given cell reference.
XLSX.sheet_column_numbers — Method.Returns real sheet column numbers (based on cellref)
XLSX.sheetcount — Method.Counts the number of sheets in the Workbook.
XLSX.sheetnames — Method.Lists Worksheet names for this Workbook.
XLSX.split_cellname — Method.split_cellname(n::AbstractString) -> column_name, row_numberSplits a string representing a cell name to its column name and row number.
Example
julia> XLSX.split_cellname("AB:12")
("AB:", 12)XLSX.split_cellrange — Method.split_cellrange(n::AbstractString) -> start_name, stop_nameSplits a string representing a cell range into its cell names.
Example
julia> XLSX.split_cellrange("AB12:CD24")
("AB12", "CD24")XLSX.split_column_range — Method.Returns tuple (columnnamestart, columnnamestop).
XLSX.sst_formatted_string — Method.sst_formatted_string(wb, index) :: StringLooks for a formatted string inside the Shared Strings Table (sst). index starts at 0.
XLSX.sst_unformatted_string — Method.sst_unformatted_string(wb, index) :: StringLooks for a string inside the Shared Strings Table (sst). index starts at 0.
XLSX.styles_add_font — Method.Defines a custom font. Returns the index to be used as the fontId in a cellXf definition.
XLSX.styles_add_numFmt — Method.Defines a custom number format to render numbers, dates or text. Returns the index to be used as the numFmtId in a cellXf definition.
XLSX.styles_cell_xf — Method.Returns the xf XML node element for style index. index is 0-based.
XLSX.styles_cell_xf_numFmtId — Method.Queries numFmtId from cellXfs -> xf nodes.
XLSX.styles_get_cellXf_with_numFmtId — Method.Cell Xf element follows the XML format below. This function queries the 0-based index of the first xf element that has the provided numFmtId. Returns -1 if not found.
<styleSheet ...
    <cellXfs count="5">
            <xf borderId="0" fillId="0" fontId="0" numFmtId="0" xfId="0"/>
            <xf applyNumberFormat="1" borderId="0" fillId="0" fontId="0" numFmtId="14" xfId="0"/>
            <xf applyNumberFormat="1" borderId="0" fillId="0" fontId="0" numFmtId="20" xfId="0"/>
            <xf applyNumberFormat="1" borderId="0" fillId="0" fontId="0" numFmtId="22" xfId="0"/>XLSX.styles_numFmt_formatCode — Method.Queries numFmt formatCode field by numFmtId.
XLSX.table_column_numbers — Method.Returns an iterator for table column numbers.
XLSX.table_column_to_sheet_column_number — Method.Maps table column index (1-based) -> sheet column index (cellref based)
XLSX.unformatted_text — Method.unformatted_text(el::EzXML.Node) :: StringHelper function to gather unformatted text from Excel data files. It looks at all childs of el for tag name t and returns a join of all the strings found.
XLSX.writetable — Method.writetable(filename, data, columnnames; [overwrite], [sheetname])data is a vector of columns. columnames is a vector of column labels. overwrite is a Bool to control if filename should be overwritten if already exists. sheetname is the name for the worksheet.
Example using DataFrames.jl:
import DataFrames, XLSX
df = DataFrames.DataFrame(integers=[1, 2, 3, 4], strings=["Hey", "You", "Out", "There"], floats=[10.2, 20.3, 30.4, 40.5])
XLSX.writetable("df.xlsx", DataFrames.columns(df), DataFrames.names(df))XLSX.writetable — Method.writetable(filename::AbstractString; overwrite::Bool=false, kw...)
writetable(filename::AbstractString, tables::Vector{Tuple{String, Vector{Any}, Vector{String}}}; overwrite::Bool=false)Write multiple tables.
kw is a variable keyword argument list. Each element should be in this format: sheetname=( data, column_names ), where data is a vector of columns and column_names is a vector of column labels.
Example:
import DataFrames, XLSX
df1 = DataFrames.DataFrame(COL1=[10,20,30], COL2=["Fist", "Sec", "Third"])
df2 = DataFrames.DataFrame(AA=["aa", "bb"], AB=[10.1, 10.2])
XLSX.writetable("report.xlsx", REPORT_A=( DataFrames.columns(df1), DataFrames.names(df1) ), REPORT_B=( DataFrames.columns(df2), DataFrames.names(df2) ))XLSX.writexlsx — Method.writexlsx(output_filepath, xlsx_file; [overwrite=false])Writes an Excel file given by xlsx_file::XLSXFile to file at path output_filepath.
If overwrite=true, output_filepath will be overwritten if it exists.
XLSX.xlsx_encode — Method.Returns the datatype and value for val to be inserted into ws.
XLSX.xmldocument — Method.xmldocument(xl::XLSXFile, filename::String) :: EzXML.DocumentUtility method to find the XMLDocument associated with a given package filename. Returns xl.data[filename] if it exists. Throws an error if it doesn't.
XLSX.xmlroot — Method.xmlroot(xl::XLSXFile, filename::String) :: EzXML.NodeUtility method to return the root element of a given XMLDocument from the package. Returns EzXML.root(xl.data[filename]) if it exists.