Ability to Transpose data in Shiftr.
 Fixes #91 #108

 Added to the '@' wildcard functionality to allow for data lookup in the input tree.
 - Example :
   input : {
    "data" : {
      "key" : "penguins",
      "value" : "fish"
    }
   }

   spec : {
     "data" : {
       "@value" : "@key"
     }
   }

   output : {
     "penguins" : "fish"
   }

   This looks up the values stored under "key" and "value" in the "data" Map:
   the value at "key" becomes the output key, and the value at "value" becomes the output value.
 - Complex paths after the '@' can be wrapped in parentheses
   - Example : "@(path.to.data.down.the[2].tree.&(1,1))"
 - Note : The "path" after an '@' cannot contain any '*' wildcards.
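The transpose example above can be sketched in plain Java. This is only an illustration of the resulting data movement, not Jolt's implementation: the class `TransposeSketch` and the helper `transpose` are hypothetical names, and the sketch handles a single flat map rather than a full spec tree.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TransposeSketch {

    /**
     * Hypothetical helper (not part of Jolt). Emulates the effect of the spec
     *   { "data" : { "@value" : "@key" } }
     * on one map: the value found under keyField becomes the output key,
     * and the value found under valueField becomes the output value.
     */
    public static Map<String, Object> transpose(Map<String, Object> data,
                                                String keyField,
                                                String valueField) {
        Map<String, Object> output = new LinkedHashMap<>();
        Object newKey = data.get(keyField);      // RHS "@key": where to write
        Object newValue = data.get(valueField);  // LHS "@value": what to write
        if (newKey != null) {
            output.put(String.valueOf(newKey), newValue);
        }
        return output;
    }

    public static void main(String[] args) {
        Map<String, Object> data = new LinkedHashMap<>();
        data.put("key", "penguins");
        data.put("value", "fish");
        System.out.println(transpose(data, "key", "value")); // prints {penguins=fish}
    }
}
```

If the key field is missing, the sketch writes nothing. That choice is an assumption made here for simplicity; the commit message does not say how Shiftr treats a failed lookup.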

 - Cleaned up the Parsing of PathElements for Shiftr.
 - Refactored the WalkedPath so that it can contain a pointer to the input from each level

 - For Transpose, the ability to coerce a Boolean into a String of "true" or "false".
 - Ability to specify a String value on the LHS by using the # operator.
  -- A compelling use case is transposing a boolean into more useful String values.

 Also, extended the '#' operator so that values from the spec can be written directly
  to the output.  This was possible before, but required two passes through Shiftr.
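A rough sketch of the LHS '#' idea, assuming a Boolean input is first coerced to "true" or "false" and then matched against spec keys. The class `HashLiteralSketch` and both of its methods are hypothetical. Only the prefix and parenthesis handling in `literalOf` follows the `HashPathElement` constructor added in this commit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HashLiteralSketch {

    /** Extracts the literal from "#foo" or "#(foo)", following HashPathElement's parsing rules. */
    public static String literalOf(String specKey) {
        if (specKey == null || specKey.length() <= 1 || !specKey.startsWith("#")) {
            throw new IllegalArgumentException("LHS # key is malformed : " + specKey);
        }
        if (specKey.charAt(1) == '(') {
            if (!specKey.endsWith(")")) {
                throw new IllegalArgumentException("mismatched parens : " + specKey);
            }
            return specKey.substring(2, specKey.length() - 1);
        }
        return specKey.substring(1);
    }

    /**
     * Hypothetical single-value evaluation of a spec fragment such as
     *   "hidden" : { "true" : { "#disabled" : "clients.clientId" } }
     * The map goes from the coerced input value to the '#' spec key.
     * Returns the literal to write, or null when no branch matches.
     */
    public static String resolve(Object inputValue, Map<String, String> hashKeysByMatch) {
        String coerced = String.valueOf(inputValue); // Boolean.TRUE becomes "true"
        String specKey = hashKeysByMatch.get(coerced);
        return specKey == null ? null : literalOf(specKey);
    }

    public static void main(String[] args) {
        Map<String, String> branches = new LinkedHashMap<>();
        branches.put("true", "#disabled");
        branches.put("false", "#(enabled)");
        System.out.println(resolve(Boolean.TRUE, branches));  // prints disabled
        System.out.println(resolve(Boolean.FALSE, branches)); // prints enabled
    }
}
```

The sketch leaves out the output path ("clients.clientId" in the example) and shows only how a matched Boolean selects a literal from the spec rather than a value from the input.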
milo.simpson committed Oct 13, 2014
1 parent 7bf1651 commit 7d9735c
Showing 56 changed files with 2,025 additions and 204 deletions.
41 changes: 35 additions & 6 deletions jolt-core/src/main/java/com/bazaarvoice/jolt/Shiftr.java
@@ -258,11 +258,30 @@
* </pre>
*
* '#' Wildcard
* Valid only on the RHS of the spec, nested in an array, like "[#2]"
* This wildcard is useful if you want to take a JSON map and turn it into a JSON array, and you do not care about the order of the array.
* Valid both on the LHS and RHS, but has different behavior / format on either side.
* The way to think of it is that it allows you to specify a "synthetic" value, aka a value not found in the input data.
*
* While Shiftr is doing its parallel tree walk of the input data and the spec, it tracks how many matched it has processed at each level
* of the spec tree.
* On the RHS of the spec, # is only valid in the context of an array, like "[#2]".
* What "[#2]" means is, go up the tree 2 levels and ask that node how many matches it has had, and then use that as an index
* in the arrays.
* This means that, while Shiftr is doing its parallel tree walk of the input data and the spec, it tracks how many matches it
* has processed at each level of the spec tree.
*
* This is useful if you want to take a JSON map and turn it into a JSON array, and you do not care about the order of the array.
*
* On the LHS of the spec, # allows you to specify a hard coded String to be placed as a value in the output.
*
* The initial use-case for this feature was to be able to process a Boolean input value, and if the value is
* boolean true write out the string "enabled". Note, this was possible before, but it required two Shiftr steps.
*
* <pre>
* Example
* "hidden" : {
* "true" : { // if the value of "hidden" is true
* "#disabled" : "clients.clientId" // write the word "disabled" to the path "clients.clientId"
* }
* }
* </pre>
*
*
* '|' Wildcard
@@ -278,7 +297,10 @@
*
*
* '@' Wildcard
* Valid only on the LHS of the spec.
* Valid on both sides of the spec.
*
* The basic '@' on the LHS.
*
* This wildcard is necessary if you want to put both the input value and the input key somewhere in the output JSON.
*
* Example '@' wildcard usage :
@@ -298,6 +320,13 @@
* </pre>
* Thus the '@' wildcard means "copy the value of the data at this level in the tree, to the output".
*
* Advanced '@' sign wildcard.
* The format looks like "@(3,title)", where
* "3" means go up the tree 3 levels and then lookup the key
* "title" and use the value at that key.
*
* See the filter*.json and transpose*.json Unit Test fixtures.
*
*
* JSON Arrays :
*
@@ -471,7 +500,7 @@ public Object transform( Object input ) {
// Create a root LiteralPathElement so that # is useful at the root level
LiteralPathElement rootLpe = new LiteralPathElement( ROOT_KEY );
WalkedPath walkedPath = new WalkedPath();
walkedPath.add( rootLpe );
walkedPath.add( input, rootLpe );

rootSpec.apply( ROOT_KEY, input, walkedPath, output );

@@ -144,7 +144,7 @@ public boolean apply( String inputKey, Object input, WalkedPath walkedPath, Obje
return false;
}

walkedPath.add( thisLevel );
walkedPath.add( input, thisLevel );

// The specialChild can change the data object that I point to.
// Aka, my key had a value that was a List, and that gets changed so that my key points to a ONE value
@@ -87,7 +87,7 @@ public Object applyToParentContainer ( String inputKey, Object input, WalkedPath
private Object performCardinalityAdjustment( String inputKey, Object input, WalkedPath walkedPath, Map parentContainer, LiteralPathElement thisLevel ) {

// Add the LiteralPathElement for this level, so that write path References can use it as &(0,0)
walkedPath.add( thisLevel );
walkedPath.add( input, thisLevel );

Object returnValue = null;
if ( cardinalityRelationship == CardinalityRelationship.MANY ) {
41 changes: 41 additions & 0 deletions jolt-core/src/main/java/com/bazaarvoice/jolt/common/PathStep.java
@@ -0,0 +1,41 @@
/*
* Copyright 2014 Bazaarvoice, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.bazaarvoice.jolt.common;

import com.bazaarvoice.jolt.common.pathelement.LiteralPathElement;

/**
* A tuple class that contains the data for one level of a
* tree walk, aka a reference to the input for that level, and
* the LiteralPathElement that was matched at that level.
*/
public final class PathStep {
private final Object treeRef;
private final LiteralPathElement literalPathElement;

public PathStep(Object treeRef, LiteralPathElement literalPathElement) {
this.treeRef = treeRef;
this.literalPathElement = literalPathElement;
}

public Object getTreeRef() {
return treeRef;
}

public LiteralPathElement getLiteralPathElement() {
return literalPathElement;
}
}
37 changes: 26 additions & 11 deletions jolt-core/src/main/java/com/bazaarvoice/jolt/common/WalkedPath.java
@@ -23,39 +23,54 @@
/**
* DataStructure used by a SpecTransform during its parallel tree walk.
*
* Basically this is Stack that records the steps down the tree that have been taken.
* For each level, there is a PathStep, which contains a pointer to the data of that level,
* and a pointer to the LiteralPathElement matched at that level.
*
* At any given point in time, it represents where in the tree walk a Spec is operating.
* It is primarily used by the ShiftrLeafSpec and CardinalityLeafSpec as a reference
* to lookup real values for output "&(1,1)" references.
*
* It is expected that as the SpecTransform navigates down the tree, LiteralElements will be added and then
* removed when that subtree has been walked.
*/
public class WalkedPath extends ArrayList<LiteralPathElement> {
public class WalkedPath extends ArrayList<PathStep> {

public WalkedPath() {
super();
}

public WalkedPath( Collection<LiteralPathElement> c ) {
super( c );
public WalkedPath(Collection<PathStep> c) {
super(c);
}

public LiteralPathElement removeLast() {
return remove( size() - 1 );
public WalkedPath( Object treeRef, LiteralPathElement literalPathElement ) {
super();
this.add( new PathStep( treeRef, literalPathElement ) );
}

/**
* Convenience method
*/
public boolean add( Object treeRef, LiteralPathElement literalPathElement ) {
return super.add( new PathStep( treeRef, literalPathElement ) );
}

public void removeLast() {
remove(size() - 1);
}

/**
* Method useful to "&", "&1", "&2", etc evaluation.
*/
public LiteralPathElement elementFromEnd( int idxFromEnd ) {
if ( isEmpty() ) {
public PathStep elementFromEnd(int idxFromEnd) {
if (isEmpty()) {
return null;
}
return get( size() - 1 - idxFromEnd );
return get(size() - 1 - idxFromEnd);
}

public LiteralPathElement lastElement() {
return get( size() - 1 );
public PathStep lastElement() {
return get(size() - 1);
}

}
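The reason for storing a tree reference in each step can be sketched with a simplified stand-in. Everything below is illustrative: `WalkedPathSketch`, `Step`, and `lookup` are hypothetical names, and the level numbering (0 is the current level, 1 its parent) is an assumption borrowed from `elementFromEnd`, not a statement of how Shiftr resolves "@(N,key)".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WalkedPathSketch {

    /** Simplified stand-in for PathStep: the input subtree at one level plus the key matched there. */
    public static final class Step {
        public final Object treeRef;
        public final String matchedKey;

        public Step(Object treeRef, String matchedKey) {
            this.treeRef = treeRef;
            this.matchedKey = matchedKey;
        }
    }

    /** Like WalkedPath.elementFromEnd, except that it returns null for any out-of-range index. */
    public static Step elementFromEnd(List<Step> path, int idxFromEnd) {
        int idx = path.size() - 1 - idxFromEnd;
        return (idx < 0 || idx >= path.size()) ? null : path.get(idx);
    }

    /** Hypothetical "@(levelsUp,key)" lookup: walk up the path, then read key from the map stored there. */
    public static Object lookup(List<Step> path, int levelsUp, String key) {
        Step step = elementFromEnd(path, levelsUp);
        if (step == null || !(step.treeRef instanceof Map)) {
            return null;
        }
        return ((Map<?, ?>) step.treeRef).get(key);
    }

    /** Builds a three-level walk (root -> book -> author) and looks one level up for "title". */
    public static Object demo() {
        Map<String, Object> author = new LinkedHashMap<>();
        author.put("name", "someone");
        Map<String, Object> book = new LinkedHashMap<>();
        book.put("title", "penguin facts");
        book.put("author", author);
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("book", book);

        List<Step> path = new ArrayList<>();
        path.add(new Step(root, "root"));     // pushed when the walk starts
        path.add(new Step(book, "book"));     // pushed when "book" matches
        path.add(new Step(author, "author")); // pushed when "author" matches
        return lookup(path, 1, "title");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints penguin facts
    }
}
```

With only `LiteralPathElement` entries on the stack, as before this change, the walk would know which keys it matched but could not reach the sibling data needed for a transpose.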
@@ -114,7 +114,7 @@ public String evaluate( WalkedPath walkedPath ) {
}
else {
AmpReference ref = (AmpReference) token;
LiteralPathElement literalPathElement = walkedPath.elementFromEnd( ref.getPathIndex() );
LiteralPathElement literalPathElement = walkedPath.elementFromEnd( ref.getPathIndex() ).getLiteralPathElement();
String value = literalPathElement.getSubKeyRef( ref.getKeyGroup() );
output.append( value );
}
@@ -98,12 +98,12 @@ public String evaluate( WalkedPath walkedPath ) {
return arrayIndex;

case HASH:
LiteralPathElement element = walkedPath.elementFromEnd( ref.getPathIndex() );
LiteralPathElement element = walkedPath.elementFromEnd( ref.getPathIndex() ).getLiteralPathElement();
Integer index = element.getHashCount();
return index.toString();

case REFERENCE:
LiteralPathElement lpe = walkedPath.elementFromEnd( ref.getPathIndex() );
LiteralPathElement lpe = walkedPath.elementFromEnd( ref.getPathIndex() ).getLiteralPathElement();
String keyPart;

if ( ref instanceof PathAndGroupReference ) {
@@ -118,7 +118,7 @@ public String evaluate( WalkedPath walkedPath ) {
return keyPart;
}
catch ( NumberFormatException nfe ) {
throw new RuntimeException( " Evaluating canonical ReferencePathElement:" + this.getCanonicalForm() + ", and got a non integer result for reference:" + ref.getCanonicalForm() );
throw new RuntimeException( " Evaluating canonical ReferencePathElement:" + this.getCanonicalForm() + ", and got a non integer result:(" + keyPart + "), for reference:" + ref.getCanonicalForm() );
}
default:
throw new IllegalStateException( "ArrayPathType enum added two without updating this switch statement." );
@@ -28,7 +28,7 @@ public AtPathElement( String key ) {
}

public LiteralPathElement match( String dataKey, WalkedPath walkedPath ) {
return walkedPath.lastElement(); // copy what our parent was so that write keys of &0 and &1 both work.
return walkedPath.lastElement().getLiteralPathElement(); // copy what our parent was so that write keys of &0 and &1 both work.
}

@Override
@@ -35,7 +35,7 @@ public String getCanonicalForm() {

@Override
public String evaluate( WalkedPath walkedPath ) {
LiteralPathElement pe = walkedPath.elementFromEnd( dRef.getPathIndex() );
LiteralPathElement pe = walkedPath.elementFromEnd( dRef.getPathIndex() ).getLiteralPathElement();
return pe.getSubKeyRef( dRef.getKeyGroup() );
}

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
/*
* Copyright 2013 Bazaarvoice, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.bazaarvoice.jolt.common.pathelement;

import com.bazaarvoice.jolt.common.WalkedPath;
import com.bazaarvoice.jolt.exception.SpecException;
import com.bazaarvoice.jolt.utils.StringTools;

/**
* For use on the LHS, allows the user to specify an explicit string to write out.
* Aka given an input that is boolean, would want to write something out other than "true" / "false".
*/
public class HashPathElement extends BasePathElement implements MatchablePathElement {

private final String keyValue;

public HashPathElement( String key ) {
super(key);

if ( StringTools.isBlank( key ) ) {
throw new SpecException( "HashPathElement cannot have empty String as input." );
}

if ( ! key.startsWith( "#" ) ) {
throw new SpecException( "LHS # should start with a # : " + key );
}

if ( key.length() <= 1 ) {
throw new SpecException( "HashPathElement input is too short : " + key );
}


if ( key.charAt( 1 ) == '(' ) {
if ( key.charAt( key.length() -1 ) == ')' ) {
keyValue = key.substring( 2, key.length() -1 );
}
else {
throw new SpecException( "HashPathElement, mismatched parens : " + key );
}
}
else {
keyValue = key.substring( 1 );
}
}

@Override
public String getCanonicalForm() {
return "#(" + keyValue + ")";
}

@Override
public LiteralPathElement match( String dataKey, WalkedPath walkedPath ) {
return new LiteralPathElement( keyValue );
}
}
@@ -56,7 +56,7 @@ public String evaluate( WalkedPath walkedPath ) {
}

@Override
public com.bazaarvoice.jolt.common.pathelement.LiteralPathElement match( String dataKey, WalkedPath walkedPath ) {
public LiteralPathElement match( String dataKey, WalkedPath walkedPath ) {
return getRawKey().equals( dataKey ) ? this : null ;
}
