Adding String conversion for preservation of indentation format for Html

michaelgantman · May 2, 2018 · 3691572
1 parent 1b79232
commit 3691572
Showing 4 changed files with 83 additions and 6 deletions.
diff --git a/README.md b/README.md
@@ -15,6 +15,12 @@ Another useful feature is parsing String to time interval. It parses Strings wit
 suffix (for example  string "38s" will be parsed as 38 seconds, "24m" - 24 minutes "4h" - 4 hours, "3d" - 3 days and "45"
 as 45 milliseconds.) This method may be very useful for parsing time interval properties such as timeouts or waiting
 periods from configuration files.<br>
+<p>
+ There is also a feature that converts a String to preserve indentation formatting in HTML without visible escape
+ sequences. It converts the String so that its spaces are not collapsed by the HTML renderer, i.e. it replaces
+ regular space characters with non-breaking spaces (known as '&amp;nbsp;'), which appear in your source as regular
+ spaces '  ' and not as '&amp;nbsp;'. It also inserts '&lt;br&gt;' before each new line character.
+</p>
 Also class
 StringUnicodeEncoderDecoder converts String into sequence of unicodes and vise-versa.<br> Finally WebUtils class provides
 a method for chunked reading of HttpRequest content. This could be useful when receiving large files from client on the
@@ -25,18 +31,18 @@ and work with version as early as 5 and up. This library is available on Maven C
         &ltdependency&gt;<br>
             &nbsp&ltgroupId&gt;com.github.michaelgantman&lt&#47;groupId&gt;<br>
             &nbsp&ltartifactId&gt;MgntUtils&lt&#47;artifactId&gt;<br>
-            &nbsp&ltversion&gt;1.1.0.3&lt&#47;version&gt;<br>
+            &nbsp&ltversion&gt;1.1.0.4&lt&#47;version&gt;<br>
         &lt&#47;dependency&gt;<br><br>
         &ltdependency&gt;<br>
             &nbsp&ltgroupId&gt;com.github.michaelgantman&lt&#47;groupId&gt;<br>
             &nbsp&ltartifactId&gt;MgntUtils&lt&#47;artifactId&gt;<br>
-            &nbsp&ltversion&gt;1.1.0.3&lt&#47;version&gt;<br>
+            &nbsp&ltversion&gt;1.1.0.4&lt&#47;version&gt;<br>
             &nbsp&ltclassifier&gt;javadoc&lt&#47;classifier&gt;<br>
         &lt&#47;dependency&gt;<br><br>
         &ltdependency&gt;<br>
             &nbsp&ltgroupId&gt;com.github.michaelgantman&lt&#47;groupId&gt;<br>
             &nbsp&ltartifactId&gt;MgntUtils&lt&#47;artifactId&gt;<br>
-            &nbsp&ltversion&gt;1.1.0.3&lt&#47;version&gt;<br>
+            &nbsp&ltversion&gt;1.1.0.4&lt&#47;version&gt;<br>
             &nbsp&ltclassifier&gt;sources&lt&#47;classifier&gt;<br>
         &lt&#47;dependency&gt;<br>
 </p>

diff --git a/pom.xml b/pom.xml
@@ -6,7 +6,7 @@
 
     <groupId>com.github.michaelgantman</groupId>
     <artifactId>MgntUtils</artifactId>
-    <version>1.1.0.3</version>
+    <version>1.1.0.4</version>
     <name>MgntUtils</name>
     <description>
         Set of various Utils: stacktrace noise filter, String to/from unicode sequence converter, Silent String parsing

diff --git a/src/main/java/com/mgnt/utils/TextUtils.java b/src/main/java/com/mgnt/utils/TextUtils.java
@@ -12,7 +12,7 @@
 
 /**
  * This class provides various utilities for work with String that represents some other type. In current version this class provides methods for
- * converting a String into its numeric value of various types (Integer, Float, Byte, Double, Long, Short). There are 2 methods for retrieving
+ * converting a String into its numeric value of various types (Integer, Float, Byte, Double, Long, Short). There are several methods for retrieving
  * Exception stacktrace as a String in full or shortened version. Shortened version of the stacktrace will contain concise information focusing on
  * specific package or subpackage while removing long parts of irrelevant stacktrace. This could be very useful for logging in web-based architecture
  * where stacktrace may contain long parts of server provided classes trace that could be eliminated with the methods of this class while retaining
@@ -34,6 +34,12 @@
  * periods from configuration files.
  * </p>
  * <p>
+ *  This class also has a method that converts a String to preserve indentation formatting in HTML without visible
+ *  escape sequences. It converts the String so that its spaces are not collapsed by the HTML renderer, i.e. it replaces
+ *  regular space characters with non-breaking spaces (known as '&amp;nbsp;'), which appear in your source as regular
+ *  spaces '  ' and not as '&amp;nbsp;'. It also inserts '&lt;br&gt;' before each new line character.
+ * </p>
+ * <p>
  *     Note that this class has a loose dependency on slf4J library. If in the project some other compatible logging library is present
  *     (such as Log4J) this class will still work without any ill effects
  * </p>
@@ -60,6 +66,8 @@ public class TextUtils {
     private static final String SUPPRESED_STAKTRACE_PREFIX = "Suppressed:";
     private static final String RELEVANT_PACKAGE_SYSTEM_EVIRONMENT_VARIABLE = "MGNT_RELEVANT_PACKAGE";
     private static final String RELEVANT_PACKAGE_SYSTEM_PROPERTY = "mgnt.relevant.package";
+    private static final String HTML_NON_BREAKING_SPACE_CHARACTER = StringUnicodeEncoderDecoder.decodeUnicodeSequenceToString("\\u00A0");
+    private static final String HTML_NEW_LINE = "<br>";
 
     static {
         initRelevantPackageFromSystemProperty();
@@ -712,6 +720,64 @@ public static void setRelevantPackage(String relevantPackage) {
         RELEVANT_PACKAGE = relevantPackage;
     }
 
+    /**
+     * This method converts a String so that its spaces are not collapsed by the HTML renderer, i.e. it replaces
+     * regular space characters with non-breaking spaces (known as '&amp;nbsp;'), which appear in your source as regular
+     * spaces '&nbsp;&nbsp;' and not as '&amp;nbsp;'. It also inserts '&lt;br&gt;' before each new line character.
+     * Here is an example. Let's say that you would like to write a text that has indentations. You cannot simply write
+     * something like
+     * <p><br>
+     *&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;This is non-indented line<br>
+     *&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;This is 2 spaces indented line<br>
+     *&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;This is 4 spaces indented line<br>
+     * </p><br>
+     * Such text, once rendered as HTML, would result in a single non-indented line:<br><br>
+     * <p>
+     * This is non-indented line This is 2 spaces indented line This is 4 spaces indented line
+     *
+     * </p><br>
+     * The solution would be to write your text as follows:
+     * <p>
+     *     <br>
+     *     This is non-indented line&lt;br&gt;<br>
+     *     &amp;nbsp;&amp;nbsp;This is 2 spaces indented line&lt;br&gt;<br>
+     *     &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;This is 4 spaces indented line<br>
+     * </p><br>
+     * That works just fine once rendered in HTML, but your source is now hard to read and difficult to maintain
+     * if you want to modify your indentations. To remedy this, you can pass your original string to this
+     * method. Say the string looks like this:
+     * <p><br>
+     *&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;This is non-indented line<br>
+     *&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;This is 2 spaces indented line<br>
+     *&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;This is 4 spaces indented line<br>
+     * </p><br>
+     * And it will return a string that looks like this:
+     *<p><br>
+     *&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;This is non-indented line&lt;br&gt;<br>
+     *&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;This is 2 spaces indented line&lt;br&gt;<br>
+     *&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;This is 4 spaces indented line&lt;br&gt;<br>
+     *</p><br>
+     * Besides the visible addition of &lt;br&gt; at the end of each line, your regular spaces (U+0020) have been
+     * replaced with non-breaking spaces (U+00A0), which look the same in your source. So if you place this
+     * modified string into your HTML source code, your indentation will be preserved and your source code stays readable.
+     * <br><b>IMPORTANT NOTE:</b> if you want to modify indentations later you can NOT just type additional spaces.
+     * You will have to either run this method again on the modified string or copy-paste those "normal
+     * looking" but actually non-breaking spaces. Also, this was tested and found to work with plain HTML, but
+     * in combination with javascript it could sometimes produce unexpected results and show an 'Â' symbol instead
+     * of a "normal-looking" space. So this method has its limitations. Test your results before you deliver. Use this
+     * method at your own risk.  &#x263a;
+     * @param rawText to be converted
+     * @return String that is converted as described above
+     */
+    public static String formatStringToPreserveIndentationForHtml(String rawText) {
+        String result = rawText;
+        if(StringUtils.isNotEmpty(rawText)) {
+            result = rawText.replace(" ", HTML_NON_BREAKING_SPACE_CHARACTER)
+                    .replace("\n", HTML_NEW_LINE + "\n");
+        }
+        return result;
+    }
+
     private static void warn(String message, Throwable t) {
         if (RELEVANT_PACKAGE != null && !RELEVANT_PACKAGE.isEmpty()) {
             logger.warn(message + getStacktrace(t));
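
The conversion added in this file can be sketched without the library on the classpath. The class below is a hypothetical, standalone stand-in for illustration only: `IndentationDemo` and `preserveIndentation` are made-up names, the null/empty check replaces `StringUtils.isNotEmpty`, and the literal `String.replace` is assumed to behave the same as the committed replacement calls for these plain arguments.

```java
// Hypothetical stand-in for formatStringToPreserveIndentationForHtml, for illustration only.
final class IndentationDemo {
    // U+00A0 is the non-breaking space; written as an escape so it stays visible in source.
    private static final String NBSP = "\u00A0";
    private static final String HTML_NEW_LINE = "<br>";

    // Replaces every regular space with U+00A0 and inserts <br> before every '\n'.
    static String preserveIndentation(String rawText) {
        if (rawText == null || rawText.isEmpty()) {
            return rawText; // nothing to convert
        }
        return rawText.replace(" ", NBSP).replace("\n", HTML_NEW_LINE + "\n");
    }

    public static void main(String[] args) {
        String raw = "This is non-indented line\n"
                + "  This is 2 spaces indented line\n"
                + "    This is 4 spaces indented line\n";
        String html = preserveIndentation(raw);
        System.out.println(html);
        // No regular space (U+0020) is left in the converted text.
        System.out.println(html.contains(" ")); // prints false
    }
}
```

Note that every space is replaced, not only the leading ones, so a browser will not wrap a converted line at the spaces between words.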

diff --git a/src/main/resources/overview.html b/src/main/resources/overview.html
@@ -20,7 +20,12 @@
 suffix (for example  string "38s" will be parsed as 38 seconds, "24m" - 24 minutes "4h" - 4 hours, "3d" - 3 days and "45"
 as 45 milliseconds.) This method may be very useful for parsing time interval properties such as timeouts or waiting
 periods from configuration files.
-<br>
+<p>
+    There is also a feature that converts a String to preserve indentation formatting in HTML without visible escape
+    sequences. It converts the String so that its spaces are not collapsed by the HTML renderer, i.e. it replaces
+    regular space characters with non-breaking spaces (known as '&amp;nbsp;'), which appear in your source as regular
+    spaces '  ' and not as '&amp;nbsp;'. It also inserts '&lt;br&gt;' before each new line character.
+</p><br>
 Also class
 StringUnicodeEncoderDecoder converts String into sequence of unicodes and vise-versa.<br> Finally WebUtils class provides
 a method for chunked reading of HttpRequest content. This could be useful when receiving large files from client on the