Discussion:
Can we expect better GPU performance on Braswell (Intel Celeron N3160) than Bay Trail (Intel Celeron J1900)?
ViruS Tadala
2016-12-15 15:19:13 UTC
Hi,

I benchmarked both platforms and observed that Braswell handled fewer
channels (decode + re-encode using VA-API) than Bay Trail.

60-65% for 6 channels on Braswell.

60-66% for 10 channels on Bay Trail.

Please let me know whether we can expect better GPU performance on Braswell
than on Bay Trail.

Best Regards,

Veeranna.
Peter Frühberger
2016-12-15 16:47:59 UTC
Hi Veeranna,

I cannot even interpret your figures. Is that the CPU load? If so, did
you check the clocks?
Which benchmark did you use? Where did the data come from, and where was it
encoded to? Was it stored? In which format?

In short: without more details, you will most likely get the following answer:

Depends!

Best regards
Peter
_______________________________________________
Libva mailing list
https://lists.freedesktop.org/mailman/listinfo/libva
ViruS Tadala
2016-12-15 17:03:01 UTC
Hi Peter,

Thanks for your quick response.

The figures are GPU load values, which I obtained by running intel_gpu_top.

I didn't check any clock details. I used the pipeline below and ran multiple
instances.

gst-launch-1.0 videotestsrc is-live=true pattern=22 horizontal-speed=20 ! \
  'video/x-raw, format=(string)I420, width=(int)1920, height=(int)1080, framerate=(fraction)10/1' ! \
  vaapiencode_h264 min-qp=20 ! queue ! \
  vaapidecode ! queue ! appsink
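
Running "multiple instances" of a pipeline like the one above could be scripted roughly as follows. This is only a sketch: the helper name run_instances is hypothetical and not something used in the thread. It starts a given number of copies of any command in the background and waits for them.

```shell
#!/bin/sh
# run_instances COUNT COMMAND [ARGS...]
# Hypothetical helper: start COUNT copies of COMMAND in the background,
# then block until all of them have exited.
run_instances() {
    count=$1
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" &              # launch one copy in the background
        i=$((i + 1))
    done
    wait                    # wait for every background copy to finish
}

# Example call (6 is the Braswell channel count reported in this thread):
#   run_instances 6 gst-launch-1.0 videotestsrc is-live=true pattern=22 \
#       horizontal-speed=20 ! \
#       'video/x-raw, format=(string)I420, width=(int)1920, height=(int)1080, framerate=(fraction)10/1' ! \
#       vaapiencode_h264 min-qp=20 ! queue ! vaapidecode ! queue ! appsink
```

Because the test source is live, the pipelines keep running until interrupted, so the helper will not return on its own in that case; stop it with Ctrl-C once the load figures have been read.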

Please let me know if you need more information.

Best Regards,
Veeranna.
--
Best Regards,
ViruS
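
For reference, the GPU load figures reported in this thread can be divided by the channel counts to get a rough per-channel number. The sketch below does only that arithmetic; the function name per_channel_load is hypothetical. It assumes the load scales linearly with the number of channels and that both GPUs were running at comparable clocks, neither of which has been established here.

```shell
#!/bin/sh
# per_channel_load LOW HIGH CHANNELS
# Hypothetical helper: divide a reported GPU load range (in percent)
# by the number of channels that produced it.
per_channel_load() {
    awk -v lo="$1" -v hi="$2" -v n="$3" \
        'BEGIN { printf "%.1f-%.1f%% per channel\n", lo / n, hi / n }'
}

per_channel_load 60 65 6     # Braswell:  10.0-10.8% per channel
per_channel_load 60 66 10    # Bay Trail: 6.0-6.6% per channel
```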
Peter Frühberger
2016-12-15 19:14:20 UTC
Hi Veeranna,
Does this kernel patch help?


From 32c513eeafa681f798b04488524994fe47355cad Mon Sep 17 00:00:00 2001
From: fritsch <***@gmail.com>
Date: Sat, 13 Aug 2016 22:56:37 +0200
Subject: [PATCH] drm/i915: intel-pm enable thresholds

---
drivers/gpu/drm/i915/intel_pm.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 06e5596..38e613c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4982,8 +4982,7 @@ static void valleyview_set_rps(struct drm_i915_private *dev_priv, u8 val)
 
 	if (val != dev_priv->rps.cur_freq) {
 		vlv_punit_write(dev_priv, PUNIT_REG_GPU_FREQ_REQ, val);
-		if (!IS_CHERRYVIEW(dev_priv))
-			gen6_set_rps_thresholds(dev_priv, val);
+		gen6_set_rps_thresholds(dev_priv, val);
 	}
 
 	dev_priv->rps.cur_freq = val;
--
2.7.4



Best regards

Peter
Peter Frühberger
2016-12-15 19:20:14 UTC
One thing I forgot:
Also check /sys/class/drm/card0/gt_min_freq_mhz while benchmarking; it may
simply be that the GPU is not clocking up correctly.
You can raise the minimum clock (as root) with:
echo 600 > /sys/class/drm/card0/gt_min_freq_mhz
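
To watch the clocks while the benchmark runs, something along these lines may help. It is a sketch, not a tested tool: the function name print_gt_freqs is hypothetical, and which gt_*_freq_mhz files exist under /sys/class/drm/card0 depends on the kernel version, so any file that is missing is simply skipped.

```shell
#!/bin/sh
# print_gt_freqs [DIR]
# Hypothetical helper: print the i915 GPU frequency files found in DIR
# (default: /sys/class/drm/card0) as name=value lines, in MHz.
print_gt_freqs() {
    dir=${1:-/sys/class/drm/card0}
    for name in gt_min_freq_mhz gt_max_freq_mhz gt_cur_freq_mhz gt_act_freq_mhz; do
        # Skip files this kernel does not expose.
        if [ -r "$dir/$name" ]; then
            printf '%s=%s\n' "$name" "$(cat "$dir/$name")"
        fi
    done
}
```

Calling print_gt_freqs once per second in a second terminal while the pipelines run (for example from a `while sleep 1` loop) shows whether the current frequency actually rises under load.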
