contact  |  about  |  sitemap



Trim HTML, Becomes boxes
Last Post 25 Mar 2010 02:32 PM by Tijn. 9 Replies.
Jonr

--
16 Feb 2010 05:07 PM
Hi, I have extracted data from a forum posting.

My original extraction contained the HTML <br><br>, which appeared to come after each new line or carriage return.

I ran a Trim HTML tags action. It removed the HTML tags, but I now have small boxes (possibly ASCII control characters) between the paragraphs, where the HTML used to be.

How do I remove these markings from the extracted data?


Tijn

--
16 Feb 2010 07:31 PM
Do you have an example forum site, or script?



jonr

--
16 Feb 2010 08:02 PM
Enclosed is a .jpg of both examples: without the HTML trim and with the HTML trim.


jonr

--
16 Feb 2010 08:06 PM
I am receiving an error on upload: "File Type Not Allowed" (the preview does not show the attachment).

Djuggler_Forum.djs

Tijn

--
17 Feb 2010 01:34 PM
Put the screenshot in a zip. Zip files are allowed.

The script you provided worked fine in both situations, Strip and Trim or a separate action.
The boxes are most likely a line feed (LF) without a carriage return (CR) on the website. Normal practice is the two combined (CRLF). To indicate an LF in Djuggler you use #10#.

You could try a Replace Text action with #10# replaced by #13##10#.
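
For anyone doing the same clean-up outside Djuggler, here is a minimal Python sketch of the idea of turning bare line feeds into CRLF. The function name lf_to_crlf is made up for illustration and is not part of Djuggler. Unlike a plain replace of every LF, it skips line feeds that already follow a carriage return, so existing CRLF pairs are not doubled up.

```python
import re

def lf_to_crlf(text: str) -> str:
    """Convert bare LF (\\n) to CRLF (\\r\\n), leaving existing CRLF pairs alone."""
    # (?<!\r) is a negative lookbehind: match \n only when it is
    # NOT already preceded by \r.
    return re.sub(r"(?<!\r)\n", "\r\n", text)

if __name__ == "__main__":
    sample = "first paragraph\nsecond paragraph\r\nthird paragraph"
    # repr() makes the control characters visible instead of printing boxes.
    print(repr(lf_to_crlf(sample)))
```

Printing with repr() is also a quick way to check which control characters are actually in the extracted text.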


jonr

--
17 Feb 2010 05:07 PM
Attached is the .jpg.

It shows the output without the trim and with the trim. I even downloaded the file I uploaded to this post and I still get the boxes (you mentioned it worked fine for you).


Djuggler_Trim_HTML_Error.zip

Tijn

--
18 Feb 2010 09:32 AM
Perhaps a setting on your computer prevents it from displaying the complete character range.
You could try installing Unicode character set support, which should let you view all Unicode characters in Windows.

See http://www.microsoft.com/resources/...x?mfr=true



Stephan

--
25 Mar 2010 08:06 AM
Hi,
I have the same problem.
I need to replace a CRLF in a text, but it simply doesn't work.
I used Replace Text with #10# and #13# (see example). How can I fix this?


crlf_problem.djs

Stephan

--
25 Mar 2010 08:17 AM
OK, I fixed the problem by using Match and Replace Regex with \r\n as the search text.
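
The same regex idea can be sketched in Python for reference. This is an illustration only, not how Djuggler implements its action; the function name and the choice of a single space as the replacement are assumptions. The pattern \r?\n is slightly wider than a plain \r\n, so it also catches a bare line feed.

```python
import re

def replace_line_breaks(text: str, replacement: str = " ") -> str:
    """Replace every CRLF or bare LF in text with the given replacement."""
    # \r?\n matches an optional carriage return followed by a line feed,
    # so both Windows (CRLF) and Unix (LF) line endings are covered.
    return re.sub(r"\r?\n", replacement, text)

if __name__ == "__main__":
    print(replace_line_breaks("line one\r\nline two\nline three"))
```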


Tijn

--
25 Mar 2010 02:32 PM
Great solution!





Forum participation and optional registration

You don't need to be registered to participate in the Djuggler forums; however, if you want to subscribe to email notifications, you need to register. You can also subscribe to the forum RSS feed.
