Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Double in CellValue by Scientific #177

Open
wants to merge 11 commits into
base: master
Choose a base branch
from
4 changes: 2 additions & 2 deletions src/Codec/Xlsx/Parser.hs
Original file line number Diff line number Diff line change
Expand Up @@ -336,7 +336,7 @@ extractSheetFast ar sst contentTypes caches wf = do
Nothing -> throwError "bad shared string index"
"inlineStr" -> mapM (fmap xlsxTextToCellValue . fromXenoNode) isNode
"str" -> fmap CellText <$> vConverted
"n" -> fmap CellDouble <$> vConverted
"n" -> fmap CellDecimal <$> vConverted
"b" -> fmap CellBool <$> vConverted
"e" -> fmap CellError <$> vConverted
unexpected ->
Expand Down Expand Up @@ -514,7 +514,7 @@ extractCellValue sst t cur
| t == "inlineStr" =
cur $/ element (n_ "is") >=> fmap xlsxTextToCellValue . fromCursor
| t == "str" = CellText <$> vConverted "string"
| t == "n" = CellDouble <$> vConverted "double"
| t == "n" = CellDecimal <$> vConverted "scientific"
| t == "b" = CellBool <$> vConverted "boolean"
| t == "e" = CellError <$> vConverted "error"
| otherwise = fail "bad cell value"
Expand Down
4 changes: 4 additions & 0 deletions src/Codec/Xlsx/Parser/Internal.hs
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ import Data.Maybe
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Read as T
import Data.Scientific (Scientific)
import Data.Typeable (Typeable)
import GHC.Generics (Generic)
import Text.XML
Expand Down Expand Up @@ -68,6 +69,9 @@ instance FromAttrVal Integer where
instance FromAttrVal Double where
fromAttrVal = T.rational

instance FromAttrVal Scientific where
fromAttrVal = T.rational

instance FromAttrVal Bool where
fromAttrVal x | x == "1" || x == "true" = readSuccess True
| x == "0" || x == "false" = readSuccess False
Expand Down
5 changes: 5 additions & 0 deletions src/Codec/Xlsx/Parser/Internal/Fast.hs
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,7 @@ import qualified Data.ByteString.Unsafe as SU
import Data.Char (chr)
import Data.Maybe
import Data.Monoid ((<>))
import Data.Scientific (Scientific)
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as T
Expand Down Expand Up @@ -231,6 +232,10 @@ instance FromAttrBs Double where
-- as for rationals
fromAttrBs = first T.pack . eitherRational . T.decodeLatin1

instance FromAttrBs Scientific where
-- as for rationals
fromAttrBs = first T.pack . eitherRational . T.decodeLatin1

instance FromAttrBs Text where
fromAttrBs = replaceEntititesBs

Expand Down
3 changes: 2 additions & 1 deletion src/Codec/Xlsx/Parser/Internal/PivotTable.hs
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ import Control.Applicative
import Data.ByteString.Lazy (ByteString)
import Data.List (transpose)
import Data.Maybe (listToMaybe, mapMaybe, maybeToList)
import Data.Scientific (fromFloatDigits)
import Data.Text (Text)
import Safe (atMay)
import Text.XML
Expand Down Expand Up @@ -105,5 +106,5 @@ fillCacheFieldsFromRecords fields recs =
then field {cfItems = mapMaybe recToCellValue recVals}
else field
recToCellValue (CacheText t) = Just $ CellText t
recToCellValue (CacheNumber n) = Just $ CellDouble n
recToCellValue (CacheNumber n) = Just $ CellDecimal (fromFloatDigits n)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use Scientific in CacheNumber as well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I don't understand what CacheNumber is, I trust in your judgement here.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically as far as I remember it's the same thing but for pivot table cache

recToCellValue (CacheIndex _) = Nothing
2 changes: 1 addition & 1 deletion src/Codec/Xlsx/Parser/Internal/Util.hs
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ eitherDecimal t = case T.signed T.decimal t of
rational :: (MonadFail m) => Text -> m Double
rational = fromEither . eitherRational

eitherRational :: Text -> Either String Double
eitherRational :: Fractional a => Text -> Either String a
eitherRational t = case T.signed T.rational t of
Right (r, leftover) | T.null leftover -> Right r
_ -> Left $ "invalid rational: " ++ show t
Expand Down
2 changes: 1 addition & 1 deletion src/Codec/Xlsx/Parser/Stream.hs
Original file line number Diff line number Diff line change
Expand Up @@ -526,7 +526,7 @@ parseValue sstrings txt = \case
string <- maybe (Left $ SharedStringsNotFound idx sstrings) Right $ {-# SCC "sstrings_lookup_scc" #-} (sstrings ^? ix idx)
Right $ CellText string
TStr -> pure $ CellText txt
TN -> bimap (ReadError txt) (CellDouble . fst) $ Read.double txt
TN -> bimap (ReadError txt) (CellDecimal . fst) $ Read.rational txt
TE -> bimap (ReadError txt) (CellError . fst) $ fromAttrVal txt
TB | txt == "1" -> Right $ CellBool True
| txt == "0" -> Right $ CellBool False
Expand Down
27 changes: 24 additions & 3 deletions src/Codec/Xlsx/Types/Common.hs
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,8 @@
{-# LANGUAGE CPP #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE ViewPatterns #-}
{-# LANGUAGE PatternSynonyms #-}

module Codec.Xlsx.Types.Common
( CellRef(..)
Expand Down Expand Up @@ -39,6 +41,7 @@ module Codec.Xlsx.Types.Common
, xlsxTextToCellValue
, Formula(..)
, CellValue(..)
, pattern CellDouble
, ErrorType(..)
, DateBase(..)
, dateFromNumber
Expand All @@ -51,7 +54,7 @@ module Codec.Xlsx.Types.Common
, _XlsxText
, _XlsxRichText
, _CellText
, _CellDouble
, _CellDecimal
, _CellBool
, _CellRich
, _CellError
Expand All @@ -72,6 +75,7 @@ import Data.Maybe (isJust, fromMaybe)
import Data.Function ((&))
import Data.Ix (inRange)
import qualified Data.Map as Map
import Data.Scientific (Scientific,toRealFloat,fromFloatDigits)
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as T
Expand Down Expand Up @@ -411,12 +415,21 @@ instance NFData Formula
-- - 18.18.11 ST_CellType (Cell Type)
data CellValue
= CellText Text
| CellDouble Double
| CellDecimal Scientific
| CellBool Bool
| CellRich [RichTextRun]
| CellError ErrorType
deriving (Eq, Ord, Show, Generic)
{-# COMPLETE CellText, CellDecimal, CellBool, CellRich, CellError #-}

viewCellDouble :: CellValue -> Maybe Double
viewCellDouble (CellDecimal s) = Just (toRealFloat s)
viewCellDouble _ = Nothing

-- view pattern, since 'CellDecimal' has replaced the old constructor.
pattern CellDouble :: Double -> CellValue
pattern CellDouble b <- (viewCellDouble -> Just b)
{-# COMPLETE CellText, CellDouble, CellBool, CellRich, CellError #-}

instance NFData CellValue

Expand Down Expand Up @@ -677,9 +690,17 @@ _CellText
CellText y1_a1ZQx -> Right y1_a1ZQx
_ -> Left x_a1ZQw)
{-# INLINE _CellText #-}
_CellDecimal :: Prism' CellValue Scientific
_CellDecimal
= (prism (\ x1_a1ZQy -> CellDecimal x1_a1ZQy))
(\ x_a1ZQz
-> case x_a1ZQz of
CellDecimal y1_a1ZQA -> Right y1_a1ZQA
_ -> Left x_a1ZQz)
{-# INLINE _CellDecimal #-}
_CellDouble :: Prism' CellValue Double
_CellDouble
= (prism (\ x1_a1ZQy -> CellDouble x1_a1ZQy))
= (prism (\ x1_a1ZQy -> CellDecimal (fromFloatDigits x1_a1ZQy)))
(\ x_a1ZQz
-> case x_a1ZQz of
CellDouble y1_a1ZQA -> Right y1_a1ZQA
Expand Down
2 changes: 1 addition & 1 deletion src/Codec/Xlsx/Types/PivotTable/Internal.hs
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ instance FromCursor CacheField where
cellValueFromNode :: Node -> [CellValue]
cellValueFromNode n
| n `nodeElNameIs` (n_ "s") = CellText <$> attributeV
| n `nodeElNameIs` (n_ "n") = CellDouble <$> attributeV
| n `nodeElNameIs` (n_ "n") = CellDecimal <$> attributeV
| otherwise = fail "no matching shared item"
where
cur = fromNode n
Expand Down
8 changes: 4 additions & 4 deletions test/Main.hs
Original file line number Diff line number Diff line change
Expand Up @@ -117,10 +117,10 @@ floatsParsingTests parser = do
let xlsx = parser bs
parsedCells = maybe mempty (_wsCells . snd) $ listToMaybe $ xlsx ^. xlSheets
expectedCells = M.fromList
[ ((1,1), def & cellValue ?~ CellDouble 12.0)
, ((2,1), def & cellValue ?~ CellDouble 13.0)
, ((3,1), def & cellValue ?~ CellDouble 14.0 & cellStyle ?~ 1)
, ((4,1), def & cellValue ?~ CellDouble 15.0)
[ ((1,1), def & cellValue ?~ CellDecimal 12.0)
, ((2,1), def & cellValue ?~ CellDecimal 13.0)
, ((3,1), def & cellValue ?~ CellDecimal 14.0 & cellStyle ?~ 1)
, ((4,1), def & cellValue ?~ CellDecimal 15.0)
]
expectedCells @==? parsedCells

Expand Down
14 changes: 7 additions & 7 deletions test/PivotTableTests.hs
Original file line number Diff line number Diff line change
Expand Up @@ -106,24 +106,24 @@ testPivotSrcCells =
where
cellMap =
[ [CellText "Color", CellText "Year", CellText "Price", CellText "Count"]
, [CellText "green", CellDouble 2012, CellDouble 12.23, CellDouble 17]
, [CellText "white", CellDouble 2011, CellDouble 73.99, CellDouble 21]
, [CellText "red", CellDouble 2012, CellDouble 10.19, CellDouble 172]
, [CellText "white", CellDouble 2012, CellDouble 34.99, CellDouble 49]
, [CellText "green", CellDecimal 2012, CellDecimal 12.23, CellDecimal 17]
, [CellText "white", CellDecimal 2011, CellDecimal 73.99, CellDecimal 21]
, [CellText "red", CellDecimal 2012, CellDecimal 10.19, CellDecimal 172]
, [CellText "white", CellDecimal 2012, CellDecimal 34.99, CellDecimal 49]
]

testPivotCacheFields :: [CacheField]
testPivotCacheFields =
[ CacheField
(PivotFieldName "Color")
[CellText "green", CellText "white", CellText "red"]
, CacheField (PivotFieldName "Year") [CellDouble 2012, CellDouble 2011]
, CacheField (PivotFieldName "Year") [CellDecimal 2012, CellDecimal 2011]
, CacheField
(PivotFieldName "Price")
[CellDouble 12.23, CellDouble 73.99, CellDouble 10.19, CellDouble 34.99]
[CellDecimal 12.23, CellDecimal 73.99, CellDecimal 10.19, CellDecimal 34.99]
, CacheField
(PivotFieldName "Count")
[CellDouble 17, CellDouble 21, CellDouble 172, CellDouble 49]
[CellDecimal 17, CellDecimal 21, CellDecimal 172, CellDecimal 49]
]

testPivotTableDefinition :: ByteString
Expand Down
12 changes: 6 additions & 6 deletions test/StreamTests.hs
Original file line number Diff line number Diff line change
Expand Up @@ -160,7 +160,7 @@ smallWorkbook = def & atSheet "Sheet1" ?~ sheet
[((row,1), a1)
, ((row,2), def & cellValue ?~ CellText ("text at B"<> tshow row <> " Sheet1"))
, ((row,3), def & cellValue ?~ CellText "text at C1 Sheet1")
, ((row,4), def & cellValue ?~ CellDouble (0.2 + 0.1))
, ((row,4), def & cellValue ?~ CellDecimal (0.2 + 0.1))
, ((row,5), def & cellValue ?~ CellBool False)
]
-- sheets = [("Sheet1" , toWs $ [1..2] >>= \row ->
Expand All @@ -170,7 +170,7 @@ smallWorkbook = def & atSheet "Sheet1" ?~ sheet
-- , ((RowIndex row, ColumnIndex 3),
-- def & cellValue ?~ CellText "text at C1 Sheet1")
-- , ((RowIndex row, ColumnIndex 4),
-- def & cellValue ?~ CellDouble (0.2 + 0.1))
-- def & cellValue ?~ CellDecimal (0.2 + 0.1))
-- , ((RowIndex row, ColumnIndex 5),
-- def & cellValue ?~ CellBool False)
-- ]
Expand Down Expand Up @@ -224,13 +224,13 @@ untypedCellsAreParsedAsFloats = do
-- as numbers explicitly in `t` attribute.
items <- runXlsxM "data/floats.xlsx" $ collectItems $ makeIndex 1
let expected =
[ IM.fromList [ (1, def & cellValue ?~ CellDouble 12.0) ]
, IM.fromList [ (1, def & cellValue ?~ CellDouble 13.0) ]
[ IM.fromList [ (1, def & cellValue ?~ CellDecimal 12.0) ]
, IM.fromList [ (1, def & cellValue ?~ CellDecimal 13.0) ]
-- cell below has explicit `Numeric` type, while others are all `General`,
-- but sometimes excel does not add a `t="n"` attr even to numeric cells
-- but it should be default as number in any cases if `t` is missing
, IM.fromList [ (1, def & cellValue ?~ CellDouble 14.0 & cellStyle ?~ 1 ) ]
, IM.fromList [ (1, def & cellValue ?~ CellDouble 15.0) ]
, IM.fromList [ (1, def & cellValue ?~ CellDecimal 14.0 & cellStyle ?~ 1 ) ]
, IM.fromList [ (1, def & cellValue ?~ CellDecimal 15.0) ]
]
expected @==? (_ri_cell_row . _si_row <$> items)

Expand Down
30 changes: 15 additions & 15 deletions test/TestXlsx.hs
Original file line number Diff line number Diff line change
Expand Up @@ -146,16 +146,16 @@ testCellMap1 = M.fromList [ ((1, 2), cd1_2), ((1, 5), cd1_5), ((1, 10), cd1_10)
where
cd v = def {_cellValue=Just v}
cd1_2 = cd (CellText "just a text, fließen, русский <> и & \"in quotes\"")
cd1_5 = cd (CellDouble 42.4567)
cd1_5 = cd (CellDecimal 42.4567)
cd1_10 = cd (CellText "")
cd3_1 = cd (CellText "another text")
cd3_2 = def -- shouldn't it be skipped?
cd3_3 = def & cellValue ?~ CellError ErrorDiv0
& cellFormula ?~ simpleCellFormula "1/0"
cd3_7 = cd (CellBool True)
cd4_1 = cd (CellDouble 1)
cd4_2 = cd (CellDouble 123456789012345)
cd4_3 = (cd (CellDouble (1+2))) { _cellFormula =
cd4_1 = cd (CellDecimal 1)
cd4_2 = cd (CellDecimal 123456789012345)
cd4_3 = (cd (CellDecimal (1+2))) { _cellFormula =
Just $ simpleCellFormula "A4+B4<>11"
}
cd5_1 = def & cellFormula ?~ sharedFormulaByIndex (SharedFormulaIndex 0)
Expand All @@ -169,16 +169,16 @@ cellRangeDvSourceMap = M.fromList [ ((1, 1), def & cellValue ?~ CellText "A-A-A"
, ((2, 1), def & cellValue ?~ CellText "B-B-B")
, ((1, 2), def & cellValue ?~ CellText "C-C-C")
, ((2, 2), def & cellValue ?~ CellText "D-D-D")
, ((1, 3), def & cellValue ?~ CellDouble 6)
, ((2, 3), def & cellValue ?~ CellDouble 7)
, ((3, 1), def & cellValue ?~ CellDouble 5)
, ((1, 3), def & cellValue ?~ CellDecimal 6)
, ((2, 3), def & cellValue ?~ CellDecimal 7)
, ((3, 1), def & cellValue ?~ CellDecimal 5)
, ((3, 2), def & cellValue ?~ CellText "numbers!")
, ((3, 3), def & cellValue ?~ CellDouble 5)
, ((3, 3), def & cellValue ?~ CellDecimal 5)
]

testCellMap2 :: CellMap
testCellMap2 = M.fromList [ ((1, 2), def & cellValue ?~ CellText "something here")
, ((3, 5), def & cellValue ?~ CellDouble 123.456)
, ((3, 5), def & cellValue ?~ CellDecimal 123.456)
, ((2, 4),
def & cellValue ?~ CellText "value"
& cellComment ?~ comment1
Expand Down Expand Up @@ -401,12 +401,12 @@ testFormattedResult = Formatted cm styleSheet merges
, _cellFormula = Nothing }
cell12 = Cell
{ _cellStyle = Just 2
, _cellValue = Just (CellDouble 1.23)
, _cellValue = Just (CellDecimal 1.23)
, _cellComment = Nothing
, _cellFormula = Nothing }
cell25 = Cell
{ _cellStyle = Just 3
, _cellValue = Just (CellDouble 1.23456)
, _cellValue = Just (CellDecimal 1.23456)
, _cellComment = Nothing
, _cellFormula = Nothing }
merges = []
Expand Down Expand Up @@ -441,10 +441,10 @@ testRunFormatted = formatted formattedCellMap minimalStyleSheet
& fontName ?~ "Calibri"
at (1, 1) ?= (def & formattedCell . cellValue ?~ CellText "text at A1"
& formattedFormat . formatFont ?~ font1)
at (1, 2) ?= (def & formattedCell . cellValue ?~ CellDouble 1.23
at (1, 2) ?= (def & formattedCell . cellValue ?~ CellDecimal 1.23
& formattedFormat . formatFont . non def . fontItalic ?~ True
& formattedFormat . formatNumberFormat ?~ fmtDecimalsZeroes 4)
at (2, 5) ?= (def & formattedCell . cellValue ?~ CellDouble 1.23456
at (2, 5) ?= (def & formattedCell . cellValue ?~ CellDecimal 1.23456
& formattedFormat . formatNumberFormat ?~ StdNumberFormat Nf2Decimal)

testFormatWorkbookResult :: Xlsx
Expand All @@ -456,7 +456,7 @@ testFormatWorkbookResult = def & xlSheets .~ sheets
, _cellComment = Nothing
, _cellFormula = Nothing })]
cellMap2 = M.fromList [((2, 3), Cell { _cellStyle = Just 1
, _cellValue = Just (CellDouble 1.23456)
, _cellValue = Just (CellDecimal 1.23456)
, _cellComment = Nothing
, _cellFormula = Nothing })]
sheets = [ ("Sheet1", def & wsCells .~ cellMap1) , ("Sheet2", def & wsCells .~ cellMap2) ]
Expand All @@ -476,7 +476,7 @@ testFormatWorkbook = formatWorkbook sheets minimalStyleSheet
sheetNames = ["Sheet1", "Sheet2"]
testFormattedCellMap1 = M.fromList [((1,1), (def & formattedCell . cellValue ?~ CellText "text at A1 Sheet1"))]

testFormattedCellMap2 = M.fromList [((2,3), (def & formattedCell . cellValue ?~ CellDouble 1.23456
testFormattedCellMap2 = M.fromList [((2,3), (def & formattedCell . cellValue ?~ CellDecimal 1.23456
& formattedFormat . formatNumberFormat ?~ (UserNumberFormat "DD.MM.YYYY")))]
sheets = zip sheetNames [testFormattedCellMap1, testFormattedCellMap2]

Expand Down
1 change: 1 addition & 0 deletions xlsx.cabal
Original file line number Diff line number Diff line change
Expand Up @@ -105,6 +105,7 @@ Library
, network-uri
, old-locale >= 1.0.0.5
, safe >= 0.3
, scientific
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about adding some reasonable bounds, e.g. >= 0.3.6 (the version where Hashable was fixed)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we'e only relying on

  • Fractional instance of Scientific,
  • the toRealFloat function

actually a quite low version bound could suffice. 0.3.0.0 appears to be the birth version of toRealFloat.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer the safer bound

, text >= 0.11.3.1
, time >= 1.4.0.1
, transformers >= 0.3.0.0
Expand Down